Abstract



The broad physics program of the KLOE experiment is based on the high event rate 
at the Frascati $\phi$ factory, and calls for an up-to-date system for data acquisition and
processing. In this review of the KLOE offline environment, the architecture of the 
data-processing system and the programs developed for data reconstruction and 
Monte Carlo simulation are described, as well as the various procedures used for 
data handling and transfer between the different components of the system. 
1 Introduction



KLOE is a general-purpose experiment permanently installed at the Frascati 
$\phi$ factory, DAΦNE. The KLOE detector was designed for the study of CP
violation in the neutral-kaon system. The versatility of the experiment allows 
for a rich physics program, including measurements of radiative $\phi$ decays,
numerous decays of charged and neutral kaons, and measurement of the hadronic
cross section, among other topics. 

The most interesting channels have branching ratios on the order of $10^{-3}$ or
smaller. For precision measurement of these decays, the DAΦNE collider has
been designed to achieve a luminosity of $5 \times 10^{32}$ cm$^{-2}$ s$^{-1}$. At this luminosity,
the $\phi$ production cross section of about 3 μb translates into an event
rate of 1.5 kHz. Bhabha events within the acceptance, together with machine- 
background and cosmic-ray events, contribute a similar amount to the total 
acquisition rate. The average KLOE event size is 2.7 kB. We therefore re- 
quire a data-acquisition (DAQ) system capable of handling a throughput of 
10 MB/s with high efficiency, a data-processing environment with file servers 
that provide bandwidth on the order of 100 MB/s, and a data-storage system 
capable of handling on the order of a petabyte of data. These numbers are 
similar to those for other major experiments currently running, and place the 
design and implementation of the DAQ and offline systems among the more 
challenging projects in the high-energy physics community. 
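
As a rough cross-check of these requirements, the sketch below multiplies out the quoted luminosity, cross section, and event size. The variable names are illustrative, and the factor of two for Bhabha, background, and cosmic-ray events is our reading of "a similar amount", not a number taken from a measurement.

```python
# Back-of-the-envelope check of the DAQ requirements quoted in the text.
MICROBARN_CM2 = 1e-30            # 1 microbarn expressed in cm^2

luminosity = 5e32                # design luminosity, cm^-2 s^-1
sigma_phi = 3 * MICROBARN_CM2    # phi production cross section, cm^2
event_size_kb = 2.7              # average event size, kB

phi_rate_hz = luminosity * sigma_phi      # phi events per second
# Assumption: Bhabha, background, and cosmic-ray events add about as much again.
total_rate_hz = 2 * phi_rate_hz
throughput_mb_s = total_rate_hz * event_size_kb / 1000.0

print(f"phi rate   : {phi_rate_hz:.0f} Hz")
print(f"total rate : {total_rate_hz:.0f} Hz")
print(f"throughput : {throughput_mb_s:.1f} MB/s")
```

Under these assumptions the estimated throughput is about 8 MB/s, which sits just below the 10 MB/s figure required of the DAQ system.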

The high sensitivity needed for the study of CP-violation effects and quantum 
interference patterns in the neutral-kaon system requires that experimental
systematics be kept under strict control. To this end, billions of events must be 
generated, with the most accurate simulation possible of the detector response 
and machine-background effects. 

KLOE data taking for physics began in the year 2000. A total integrated 
luminosity of about 500 pb$^{-1}$ was collected by the end of 2002. KLOE data
collection is expected to resume at a rate of 10 pb$^{-1}$/day in 2004.

In this paper, we discuss the KLOE offline data-processing system. We briefly 
describe the KLOE detector in Sec. 2. The main features of the data-processing 
environment and the operation of the computer farm are discussed in Sec. 3. 
The algorithms used in the reconstruction code and their implementation are 
described in Sec. 4. The KLOE Monte Carlo and its use in event-simulation
campaigns are discussed in Sec. 5. In Sec. 6, we summarize and draw some
conclusions from our experience. 



2 The KLOE detector 



For the discrimination of the CP-violating decays $K_L \to \pi^+\pi^-$ and $K_L \to \pi^0\pi^0$
from the much more abundant $K_L \to \pi\mu\nu$ and $K_L \to 3\pi^0$ decays, we
require of the detector good momentum resolution for charged tracks, as well as
full solid-angle coverage and excellent energy and time resolution for photons.
Moreover, given the rather long mean decay length of the $K_L$ at DAΦNE
(3.4 m), a large detector is required in order to have reasonable geometrical 
acceptance. 

The KLOE detector is composed of two subdetectors: a large drift chamber 
(DC) to measure charged tracks, and an electromagnetic calorimeter (EmC) 
to detect photons. Both are immersed in the 0.52 T field of a superconducting 
solenoid. 

The drift chamber [1] is a cylinder of 25 (198) cm inner (outer) radius and 332 
cm length; it contains 12 582 drift cells distributed in 58 cylindrical layers. For 
the 12 inner layers, the cell dimensions are $2 \times 2$ cm$^2$, while for the 46 outer
layers, they are $3 \times 3$ cm$^2$. In order to provide uniform coverage throughout
the chamber volume, all wires are stereo wires. The signs of the stereo angles 
(with respect to the beam axis) alternate from layer to layer, and the mag- 
nitude of the stereo angle for each layer gradually increases, from 60 mrad 
for the innermost layer to 150 mrad for the outermost. The total number of 
wires (sense + field + guard) is about 52 000. The spatial resolution in the 
$r\phi$ plane is about 150 μm; in the z direction, the spatial resolution depends on
the stereo angle and is about 2 mm. The chamber is filled with a gas mixture 
of 90% helium and 10% isobutane. This low-Z mixture has been chosen to
reduce the effects of regeneration, photon conversion, and multiple scattering, 
where the latter has a particularly significant effect on the momentum res- 
olution for tracked particles given the momenta involved in the experiment 
(100-500 MeV/c). The transverse-momentum resolution is $\sigma_{p_t}/p_t \lesssim 0.4\%$ for
large-angle tracks. Vertices inside the chamber are reconstructed with a spa- 
tial resolution of ~3 mm. The chamber was recently instrumented with ADCs 
to supplement the experiment's particle-identification capability with $dE/dx$
information for reconstructed tracks. 

The electromagnetic calorimeter [2] is of the sampling type, and is made of lead 
layers and scintillating fibers, with a volume proportion of lead:fiber:epoxy =
42:48:10. The total thickness of the EmC is 23 cm, corresponding to about
15 $X_0$. The EmC is composed of a barrel and two endcaps. The barrel is
divided into 24 modules. Each endcap is divided into 32 (vertical) modules, 
which have a C shape to close the solid angle as much as possible. The light 
from the fibers is viewed by a photomultiplier tube (PMT) at each end to 
determine the time of flight and impact point along the direction of the fibers. 
The readout is segmented in depth into 5 planes (each 4.4 cm thick, except 
for the outermost, which is 5.2 cm thick), and in the coordinate transverse to 
the fibers into columns 4.4 cm wide. In all, there are 4880 PMTs. To complete 
the coverage of the solid angle, two small calorimeters, QCAL [3], made of 
lead and scintillating tiles, are wrapped around the low-$\beta$ quadrupoles. The
PMT signals (after an electronic delay of about 200 ns) are sent to ADCs for 
amplitude analysis, to TDCs for time-of-flight measurement, and to the trigger 
modules. The energy resolution for photons is $\sigma_E/E = 5.7\%/\sqrt{E(\mathrm{GeV})}$ and
the time resolution is $\sigma_t = [54/\sqrt{E(\mathrm{GeV})} \oplus 50]$ ps. The photon impact point
is measured with a precision of ~1 cm/$\sqrt{E(\mathrm{GeV})}$ along the fibers and ~1 cm
in the transverse coordinate. 

The trigger [4] is based on energy deposits in 88 calorimeter sectors (formed by 
grouping adjacent readout elements) and on drift-chamber signals. The level-1
trigger, which starts data readout with minimal delay, requires energy deposits 
above threshold (E > 50 MeV in the barrel, E > 150 MeV in the endcaps) in 
two EmC sectors, or ~15 DC wire signals within 250 ns. Low-angle Bhabha 
events can be downscaled at this level. The level-2 trigger, which validates the 
level-1 trigger, requires further multiplicity or geometrical conditions for EmC 
energy deposits, or ~120 DC wire signals within a 1.2 μs time window (the
maximum drift time is 1-1.5 μs, depending on cell size). A cosmic-ray veto is
applied at level 2. The acquisition dead time is about 2.7 μs (corresponding to
a 0.8% loss at a typical rate of 3 kHz). A level-3 trigger filter is implemented 
in software to review and enforce the cosmic-ray veto decision made at level 
2. 
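
The dead-time loss quoted above follows from a first-order estimate, sketched here under the assumption that each accepted trigger blocks the acquisition for a fixed time and that the product of rate and dead time is much smaller than one. The variable names are ours.

```python
# First-order dead-time loss: fraction of time the acquisition is busy.
dead_time_s = 2.7e-6      # acquisition dead time per trigger, s
trigger_rate_hz = 3.0e3   # typical trigger rate, Hz

loss_fraction = trigger_rate_hz * dead_time_s
print(f"dead-time loss: {100 * loss_fraction:.2f}%")
```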

The trigger is synchronized with a demultiplied DAΦNE radio-frequency signal
that corresponds to every fourth bunch crossing ($t_{\rm sync} = 4\,t_{\rm bunch} = 10.85$ ns).
The association of the event with the proper bunch crossing, or determination 
of the event-start time, is made during offline reconstruction. 

The DAQ system [5] handles about 23 000 front-end channels (ADC, TDC 
and trigger modules) hosted in VME crates organized in ten chains. Sub- 
events from each chain are sent through an FDDI switch to the online farm for 
event building, formatting, and monitoring. The online farm consists of seven 
IBM 7026-H50 SMPs, each with four 332-MHz PowerPC 604e processors. The 
online servers write the raw-data files to 1.4 TB of locally mounted SSA disks. 
The readout system has been designed for a sustained rate of 10 MB/s. At 
a typical luminosity of $5 \times 10^{31}$ cm$^{-2}$ s$^{-1}$ during 2002, the trigger rate was
1.6 kHz and the average event size was 2.7 kB, leading to a sustained data 
acquisition rate of 4.3 MB/s, which was managed using three out of seven 
online nodes. 



3 The offline computing environment 

Raw data from the online systems are reconstructed on the KLOE offline 
farm. In this section, we first give an overview of the procedure by which 
raw data are reconstructed, divided into analysis streams, and then further 
reduced into data-summary tape (DST) streams. (Monte Carlo production 
is also performed on the offline farm; the processing of Monte Carlo events 
is described in Sec. 5.) We then describe the offline hardware environment, 
the data-handling system (which is common to both the online and offline 
environments), and the offline software environment. 



3.1 Overview of data processing

The event-builder processes running on the online farm write raw events to the 
online disk pool in 1-GB files. Data taking is divided into runs of approximately 
equal integrated luminosity (200 nb$^{-1}$ in year 2002). Typically, about 20 raw-data
files are written per run. For each run, the run number is used to uniquely
associate with the events:

• a set of calibration constants; 

• values for machine parameters such as energy, beam position, etc.; 

• quantities related to the detector status such as high- and low-voltage set- 
tings, trigger thresholds, drift-chamber gas parameters, dead-channel lists, 
etc. 

All data are permanently stored in a tape library as described in Sec. 3.3. Raw-data
files are kept on disk until calibration and reconstruction are completed.
The archival of raw-data files and the availability of free space on the online 
disk pool are managed by the data-handling system as described in Sec. 3.4. 

For the drift-chamber calibration [6], two procedures are in use. The first and
most commonly used procedure performs a fast analysis to test the valid- 
ity of the most recent values of the calibration constants. This program runs 
concurrently with data taking, using cosmic-ray events selected and buffered 
by the DAQ system. The second procedure performs a complete analysis of 
cosmic-ray muon tracks in the DC to update the calibration constants; it is 
launched only if the existing calibrations fail to describe the detector perfor- 
mance. This typically happens only a few times during an entire data-taking 
period, essentially when the atmospheric pressure changes by more than 1%. 
The drift-chamber calibration procedures are further described in Sec. 4.2. 

For the calorimeter, the calibration procedure [2] is started at the end of each 
run and lasts about two hours. The procedure uses Bhabha and $\gamma\gamma$ events
selected by the DAQ system: the 500 MeV photons are used to set the absolute 
energy and time scales, while the higher-statistics sample of 500 MeV electrons 
and positrons allows the equalization of the energy scale between different 
calorimeter columns. With an integrated luminosity of 200 nb$^{-1}$, the time
scale is determined to within 10 ps, and the energy scale is accurate at the 
percent level. 

Various other processes running on the online servers perform on-the-fly re- 
construction of selected events to monitor the status of the detector and data- 
taking conditions (such as hardware efficiencies, noise rates, machine energy, 
and beam-spot position). The slow-control system combines these data with 
hardware-status information (such as high- and low-voltage settings and dead-channel
maps); it also receives information from the DAΦNE control systems
on machine parameters (such as beam currents and number of bunches) and
sends information on the status of the experiment to the DAΦNE operators.
Monitoring information from all of these sources is summarized and written to 
the central KLOE database described in Sec. 3.4. Geometry files and calibra- 
tion constants, as well as some information on long-term detector conditions, 
are stored using the CERN HEPDB database [7]. 

Event reconstruction is performed on the offline farm. The reconstruction program
DATAREC starts immediately after the completion of the calibration jobs
for the run. Each of the 20 or so raw-data files making up the run is processed
in parallel by a separate reconstruction job. Each job produces one
reconstructed file for each analysis stream.
In practice, a single job manager periodically interrogates the database, identifies
new runs ready for processing, and starts jobs on the free CPUs of the
offline farm. The status of these jobs and the overall status of the offline farm 
itself are monitored via the web interface to the slow-control system. The re- 
construction jobs provide additional data-quality and monitoring information, 
a summary of which is available from the slow-control web interface. 

The reconstruction program DATAREC consists of several modules that per- 
form the following tasks: 

• loading of DC and EmC calibration constants; 

• EmC cluster reconstruction from single cells and determination of deposited 
energy and time of flight; 

• determination of the correct bunch crossing; 

• rejection of machine-background and cosmic-ray events; 

• pattern recognition and track fitting for charged particles in the DC; 

• vertex reconstruction for charged particles; 

• association of DC tracks with EmC clusters; 

• event classification. 

The algorithms developed for these tasks are described in Sec. 4. 

The processing path for event reconstruction has been designed to filter out 
machine-background and cosmic-ray events at an early stage, before tracking 
in the DC, which is the most CPU-intensive reconstruction task. The filter 
algorithm, FILFO, is based only on information from the EmC, and is able to 
cut out a significant portion of background events. 
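
The ordering of the processing path can be sketched as follows. This is only an illustration of running a cheap calorimeter-based filter ahead of the expensive tracking step: the `Event` class, the energy cut, and the callback names are hypothetical placeholders and do not describe the actual FILFO selection.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

@dataclass
class Event:
    """Toy event; the fields are placeholders, not KLOE bank contents."""
    event_id: int
    emc_energy_mev: float

def reconstruct(events: Iterable[Event],
                emc_filter: Callable[[Event], bool],
                track: Callable[[Event], None]) -> List[Event]:
    """Run the cheap calorimeter-based filter first; track only survivors."""
    kept = []
    for ev in events:
        if not emc_filter(ev):   # rejected before any drift-chamber work
            continue
        track(ev)                # expensive step, run only for survivors
        kept.append(ev)
    return kept

# Hypothetical stand-ins for the real filter and tracking code.
tracked_ids = []
events = [Event(1, 20.0), Event(2, 480.0), Event(3, 5.0), Event(4, 310.0)]
survivors = reconstruct(events,
                        emc_filter=lambda ev: ev.emc_energy_mev > 100.0,
                        track=lambda ev: tracked_ids.append(ev.event_id))
print([ev.event_id for ev in survivors])
```

In this toy run, the tracking callback is invoked only for the two events that pass the filter.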

For easier and faster access to the data sample, the last step of the recon- 
struction procedure is the classification of events on the basis of topological 
information into different files (or streams), to be used for different physics 
analyses. Currently, five streams are defined, containing Bhabha-scattering
events, $\phi$ decays into charged kaons, $\phi$ decays into neutral kaons, $\phi \to \pi^+\pi^-\pi^0$
decays, and radiative $\phi$ decays.

The latter four streams undergo a further level of data reduction, in which 
only the information used in the final stages of physics analysis is retained. 
The resulting set of data-summary tapes (DSTs) is about six times smaller in 
size than the corresponding set of reconstruction output files, and can be kept 
largely on disk for easy access by any user program. DST production is auto- 
matically launched once a run has been completely reconstructed. Besides data 
reduction itself, other tasks needed for the optimization of the reconstruction 
of each stream are performed during DST production. For example, a refined 
track fit is performed for events containing charged kaons. This fit properly 
uses the kaon mass in the treatment of energy loss and multiple scattering for 
identified kaon tracks. 

Fig. 1. KLOE computing hardware configuration during 2001-2002.



Because of the continuous improvement in our understanding of the perfor- 
mance of the detector and the increasing statistical sensitivity afforded by the 
growth of the data set, the calibration procedures and reconstruction algo- 
rithms are in constant evolution. To allow physics analyses to benefit from 
the corresponding improvements in reconstruction quality, we periodically reprocess
raw data that were originally processed with an earlier version of the
reconstruction code. During the first four months of 2002, the data sample of
~180 pb$^{-1}$ collected in 2001 was completely reprocessed to include improvements
to the timing calibration of the calorimeter, the background filter, and
the selection criteria for charged and neutral kaons. The 2001 and 2002 data 
were thus reconstructed using an identical path and homogeneous code. 



3.2 Offline farm



The configuration of the KLOE computing hardware is schematically repre- 
sented in Fig. 1. 

The offline farm consists of a mix of IBM 7026-B80 SMPs running AIX, each 
with four 375-MHz Power3 CPUs; and Sun E450 SMPs running Solaris, each 
with four 400-MHz UltraSPARC II CPUs. In all, 23 B80s and 10 E450s are 
available, and provide a total processing power equivalent to about 110 of the 
processors installed in the B80s, or about 30 000 SPECint2000. 

The CPU time needed for data reconstruction and simulation is summarized
in Table 1. Here and throughout this paper, all CPU times refer to a
single processor on one of the B80 servers.

Table 1
CPU-time consumption for reconstruction and Monte Carlo simulation on the
KLOE offline farm. All CPU times refer to a single processor on one of the B80
servers.

Task                              CPU time/event (ms)   CPU time/fb$^{-1}$ (days)
Data reconstruction               20                    9600
Data simulation ($\phi$ decays)   200                   6650
Monte Carlo reconstruction        175                   5100

The CPU time needed for data
reconstruction depends on the effectiveness of the FILFO filter in rejecting
background events in the presence of variable data-taking conditions. The
entries in the table reflect the data-taking conditions in 2002, when FILFO was
able to reduce the input rate by 60%. Rejected events are discarded immediately
after reconstruction in the EmC, which takes only 5 ms. For events passing the
filter, DC reconstruction takes about 40 ms, where this number is a sample-weighted
average of the reconstruction times for Bhabha events (~30 ms),
$\phi$-decay events (~120 ms), and a small fraction of unrejected background
events (15-40 ms). Averaged over all input events, then, the time needed to
reconstruct an event is 20 ms.
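
The average quoted above can be reproduced with a short calculation. The sketch assumes that every input event pays for the EmC reconstruction and that only events passing the filter pay for DC reconstruction; this accounting is our reading of the text, and the variable names are illustrative.

```python
reject_fraction = 0.60   # fraction of input events rejected by the filter (2002)
t_emc_ms = 5.0           # EmC reconstruction, done for every input event
t_dc_ms = 40.0           # DC reconstruction, done only for events passing the filter

# Assumption: events passing the filter pay for both EmC and DC reconstruction.
t_avg_ms = t_emc_ms + (1.0 - reject_fraction) * t_dc_ms
print(f"average time per input event: {t_avg_ms:.0f} ms")
```

The result, about 21 ms, agrees with the quoted 20 ms to within the rounding of the inputs.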

Currently, about 80% of the processing power is used for production-related 
tasks; the remainder is allocated to physics analysis tasks. Additional machines 
can be opened to user batch and interactive sessions as the need arises. In this 
configuration, the total processing power allocated to production is adequate 
for the purposes of data reconstruction in parallel with acquisition. Fig. 2 
illustrates the progress of the 2002 data-taking campaign. The growth of the 
reconstructed data set closely follows that of the acquired data set. From 
the point of view of both hardware and software, the operation of the offline 
systems is seen to be smooth and reliable. 

The time needed for DST production varies from stream to stream. This is 
in part because of the different abundances of selected events, and in part 
because the algorithms applied vary in CPU intensity (as noted in Sec. 3.1, 
$K^+K^-$ events are completely re-reconstructed at the DST production stage).
DST-production rates range from 50 nb$^{-1}$/CPU hour for the $K^+K^-$ stream,
to 600 nb$^{-1}$/CPU hour for the radiative $\phi$-decay stream. Processing of all four
streams proceeds at 40 nb$^{-1}$/CPU hour.
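
If the CPU time per unit of luminosity simply adds across streams (an assumption on our part), the per-stream rates combine like resistors in parallel. The sketch below uses this to infer the combined rate of the two streams whose individual rates are not quoted; the inferred value is an illustration of the arithmetic, not a measured number.

```python
rate_all = 40.0     # all four streams together, nb^-1 per CPU hour
rate_kpkm = 50.0    # K+K- stream
rate_rad = 600.0    # radiative phi-decay stream

# CPU hours per nb^-1 are additive, so rates combine as 1/r = sum(1/r_i).
hours_per_nb_rest = 1.0 / rate_all - 1.0 / rate_kpkm - 1.0 / rate_rad
rate_rest = 1.0 / hours_per_nb_rest
print(f"implied combined rate of the other two streams: {rate_rest:.0f} nb^-1/CPU hour")
```

The calculation also makes clear that the $K^+K^-$ stream dominates the total DST-production time.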

During the past three years of operation, the power of the offline farm has 
grown in parallel with the demands of the experiment, from 16 B80 CPU 
equivalents in the year 2000, to the 110 currently available. As part of an 
offline-system upgrade for the year 2004, ten new IBM p630 servers, each 
with four 1.45-GHz Power4+ processors, are currently being installed. This
increases the total CPU power of the offline farm to about 225 B80 equivalents,
or about 60 000 SPECint2000. The upgrade will provide CPU power sufficient
for reconstruction, DST processing, and Monte Carlo production, simultaneously
and in parallel with the acquisition of data at an average luminosity of
$1 \times 10^{32}$ cm$^{-2}$ s$^{-1}$.

Fig. 2. a) Integrated luminosity per week in 2002. b) Total integrated luminosity
vs. data-taking week in 2002. Histograms refer to the data taking; triangles refer to
the reconstructed sample.



3.3 Data storage, data access, and networking 

Data are permanently stored in an IBM 3494 tape library. The library has 
12 Magstar 3590 tape drives which can read and write at 14 MB/s, dual ac- 
tive accessors, and space for about 5400 60-GB cartridges, for a maximum 
capacity of about 324 TB. The library is maintained using IBM's Tivoli Stor- 
age Manager [8]. The library usage is summarized in Table 2. Note that the 
specific volume of the raw data (TB/pb$^{-1}$) decreases from year to year because
of background reduction due to better software filters and improved
DAΦNE operations. During the running period scheduled for 2004, we expect
that DAΦNE upgrades recently completed will allow us to collect a data set
of about 2 fb$^{-1}$. To store the new data, we will need at least an additional
300 TB of long-term storage capacity. To satisfy this need, we are currently 
in the process of ordering a second tape library. 
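
The storage figures in this paragraph can be checked with a few lines of arithmetic. The sketch uses decimal units (1 TB = 1000 GB), and the overall storage volume per pb$^{-1}$ it derives covers all data types together; that figure is implied by the numbers above rather than quoted directly.

```python
cartridges = 5400
cartridge_gb = 60
capacity_tb = cartridges * cartridge_gb / 1000     # decimal TB

expected_lumi_pb = 2000.0      # about 2 fb^-1 expected in 2004
extra_storage_tb = 300.0       # additional long-term storage needed
tb_per_pb = extra_storage_tb / expected_lumi_pb    # all data types combined

print(f"library capacity : {capacity_tb:.0f} TB")
print(f"implied volume   : {tb_per_pb:.2f} TB/pb^-1")
```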

A 6.3-TB offline-disk pool is used for data transfers to and from the library. 
The disk pool consists of 4.0 TB of Fibre Channel (FC) and 2.3 TB of SSA 
disks, configured in striping mode. Two IBM 7026-H80 SMPs running AIX, 
each with six 500-MHz RS64-III CPUs and 2 GB of RAM, locally mount the
offline-disk pool and tape library and are used as file servers. With the two file
servers working in concert, aggregate I/O rates of over 100 MB/s have been
obtained.

Table 2
KLOE tape library usage at the end of 2002. The entries for DSTs include MC
DSTs. DSTs were not produced for the 2000 data. A total of 184 TB are currently
occupied.

Analysis jobs usually use DSTs as input. For the 2001-2002 data, the set of 
DSTs occupies 4 TB; MC DSTs occupy an additional 3 TB. About 5.5 TB 
of the offline disk pool is used to cache files recalled from the tape library 
by the data-handling system; copies of the bulk of the DSTs reside in this 
cache for prompt access. The output from analysis jobs is written to user and 
working-group areas on the KLOE AFS cell. The AFS cell is served by two 
IBM 7026-H70 SMPs, each with four 340-MHz RS64-III CPUs, 850 GB of 
SSA disks, and 250 GB of FC disks, for a total cell capacity of 2.2 TB. Users 
can access the AFS cell from PCs running Linux on their desktops to perform 
the final stages of their analyses. 

Network connections are routed through a Cisco Catalyst 6000 switch. The 
file and AFS servers are connected to the switch via Gigabit Ethernet. Con- 
nections to all other nodes are via Fast Ethernet. 



3.4 Data handling 

A diagram of the data-handling scheme is presented in Fig. 3. 

When new data are acquired, the online servers write the raw files to the 
online-disk pool. These files are then asynchronously archived to the tape 
library over an NFS mount by the archiver daemon. The archiving processes 
are tailored to minimize the number of tape mounts while guaranteeing enough 
space on the disk pool. 

Fig. 3. Schematic layout of KLOE data handling.

Normally, reconstruction is performed while the raw files are still resident on
disk. For input to the reconstruction processes from the online disk, events are
either read across an NFS mount or served by the data-handling system using a 
custom TCP/IP protocol, which is provided by the KLOE Integrated Dataflow
package (KID) [9]. Reconstruction output is written via NFS to the offline-disk
pool, from which it is asynchronously archived to tape. DSTs for each run are 
produced from the reconstruction output files, usually immediately after the 
run has been completely reconstructed. In this case, the reconstructed events 
may be read back in across the NFS mount for DST production. When files 
already archived and deleted from the online- or offline-disk pools must be 
processed on the offline farm, the recalld daemon restores the files from tape 
to the recall disk cache, from where they are served to the offline processes 
using the KID protocol. The spacekeeper daemon ensures the availability of
disk space in the staging areas by deleting files that have been archived. The
successful completion of calibration, reconstruction, and archival is signaled
by flags in the database (see below).

The same model for data access used for reconstruction applies to user analysis 
jobs running on the offline farm. In principle, users may need to analyze raw, 
reconstructed, or DST files. If the files requested are resident on the online- 
or offline-disk pools, they are copied to the recall disk cache by recalld to 
be served to the user processes; otherwise, they are restored to the recall disk 
cache from tape. A filekeeper daemon ensures the availability of free space 
in the recall areas, deleting old files when necessary to make space for newly 
recalled data. 
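
The behavior of such a space-keeping daemon can be illustrated with a small model. The text states only that old files are deleted when space is needed; the least-recently-used policy, the class name, and the file names below are assumptions made for the sake of the example and are not taken from the KLOE code.

```python
from collections import OrderedDict

class RecallCache:
    """Toy model of a recall area with a fixed capacity (sizes in GB)."""

    def __init__(self, capacity_gb: float):
        self.capacity_gb = capacity_gb
        self.files = OrderedDict()   # name -> size, least recently used first

    def used_gb(self) -> float:
        return sum(self.files.values())

    def access(self, name: str) -> bool:
        """Mark a cached file as recently used; return False on a cache miss."""
        if name not in self.files:
            return False
        self.files.move_to_end(name)
        return True

    def recall(self, name: str, size_gb: float) -> list:
        """Add a newly recalled file, deleting the oldest files if needed."""
        if size_gb > self.capacity_gb:
            raise ValueError("file larger than the whole cache")
        evicted = []
        while self.used_gb() + size_gb > self.capacity_gb:
            old_name, _ = self.files.popitem(last=False)
            evicted.append(old_name)
        self.files[name] = size_gb
        return evicted

cache = RecallCache(capacity_gb=3.0)
cache.recall("run23015_a.dst", 1.0)
cache.recall("run23015_b.dst", 1.0)
cache.recall("run23016_a.dst", 1.0)
cache.access("run23015_a.dst")               # becomes most recently used
print(cache.recall("run23017_a.dst", 1.5))   # evicts the two oldest files
```

With this policy, files that users keep reading stay in the cache, while files that have not been touched recently are the first to be deleted.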

A central database based on IBM's DB2 [10] is used to keep track of the 
locations of the several million files comprising the data set [11]. Each file 
is logged in the database when it is created. The database entry contains 
the reconstruction status of the file, allowing files that require processing to 
be easily identified. This database also contains run-by-run information on 
data-taking conditions and operational parameters of the detector, as noted
in Sec. 3.1. 

The backbone of the data-handling system is the KID package, which consists
of two pieces: a centralized data-handling daemon, which coordinates
the distributed file-moving services; and a client library, with an easy-to-use
URL-based interface that allows access to files independent of their locations.
KID URLs may incorporate SQL queries used to interrogate the file database.
Examples of such URLs include:

• All raw files in the stated run range that have not yet been reconstructed: 
dbraw:run_nr between 23000 and 24000 and analyzed is not null 

• All reconstructed files in the $K_S K_L$ stream for a given run:
dbdatarec:run_nr = 23015 and stream_code = ksl 
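
The shape of these URLs, a prefix followed by a colon and an SQL-like condition, can be illustrated with a small parser. The helper below is hypothetical and is not part of the KID client library, whose actual interface is not shown here; it only splits a URL of the form given in the examples above.

```python
from typing import NamedTuple

class KidQuery(NamedTuple):
    table: str      # prefix before the first colon, e.g. "dbraw"
    condition: str  # text after the colon, read here as an SQL-like condition

def split_kid_url(url: str) -> KidQuery:
    """Split 'prefix:condition' at the first colon (hypothetical helper)."""
    prefix, sep, condition = url.partition(":")
    if not sep or not prefix or not condition.strip():
        raise ValueError(f"not a prefix:condition URL: {url!r}")
    return KidQuery(prefix, condition.strip())

q = split_kid_url("dbdatarec:run_nr = 23015 and stream_code = ksl")
print(q.table)
print(q.condition)
```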

3.5 Software environment 

The DATAREC program is built upon the framework provided by the ANALYSIS_CONTROL
(A_C) package developed at FNAL [12]. A_C provides the tools
for building the executable from KLOE analysis modules, as well as a user
interface that allows the processing sequence and choice of enabled streams
to be specified at run time. In order to use A_C in the KLOE environment,
numerous customizations of the library have been implemented; in particular,
the KID package (Sec. 3.4) has been seamlessly interfaced. The source-code
versions for analysis modules used in the DATAREC program are tracked using
CVS [13].

The data format consists of independent collections of tabular data structures, 
or banks, for each event. They are read and written using the YBOS package 
[14], which provides tools for platform-independent memory management and
for the definition of tabular data structures that can be manipulated in Fortran 
code. 

An interface to the ZLIB library [15] has also been added to A_C to allow reading 
and writing of compressed data. The compression/decompression routines are 
transparently called from A_C internals. A compression factor of about 0.6 is
obtained for reconstructed output. 
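
A minimal illustration of such a compression round trip is given below, using the zlib bindings in the Python standard library rather than the A_C interface itself. We take "compression factor" to mean compressed size divided by original size, which is an assumption; the payload is an arbitrary stand-in, so the factor it yields says nothing about real reconstructed data.

```python
import zlib

# Stand-in payload; real reconstructed banks will compress differently.
payload = b"".join(i.to_bytes(4, "little") for i in range(0, 20000, 3))

compressed = zlib.compress(payload, 6)   # compression level 6
restored = zlib.decompress(compressed)

assert restored == payload               # the round trip is lossless
factor = len(compressed) / len(payload)
print(f"compression factor: {factor:.2f}")
```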

3.6 Analysis considerations 

In addition to production jobs, user analysis jobs also run on the offline farm. 
In 2003, about 20% of the offline CPU power was available to users for the 
production of histograms and Ntuples. About two-thirds of the machines open
to user sessions were reserved for batch jobs, with queues managed by IBM's 
LoadLeveler [16]. 

As an example of the execution time for user jobs, consider the analysis of 
the 2001-2002 $K_S K_L$ data set, which consists of $4.5 \times 10^8$ events in 1.4 TB of
DSTs, the majority of which are resident on disk in the recall disk cache for 
prompt access. With six batch jobs running in parallel (the default per-user 
maximum), the entire data set can be analyzed in six days elapsed. The output 
size ranges from 10 to 100 GB, which can be accessed in situ on the AFS cell 
or copied off to a user's desktop PC. 
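
The throughput implied by this example can be estimated as follows. The sketch assumes that all six jobs run for the full six days and uses decimal units; the variable names are ours.

```python
n_events = 4.5e8        # events in the data set
dst_size_tb = 1.4       # total DST volume, TB
n_jobs = 6              # batch jobs running in parallel
elapsed_days = 6        # wall-clock time for the full pass

elapsed_s = elapsed_days * 86400
events_per_s_per_job = n_events / elapsed_s / n_jobs
read_rate_mb_s = dst_size_tb * 1e6 / elapsed_s     # aggregate over all jobs
kb_per_event = dst_size_tb * 1e9 / n_events

print(f"{events_per_s_per_job:.0f} events/s per job")
print(f"{read_rate_mb_s:.1f} MB/s aggregate read rate")
print(f"{kb_per_event:.1f} kB per DST event")
```

The aggregate read rate of a few MB/s is far below the I/O capacity of the file servers, which suggests that such a job is limited by CPU rather than by data access.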



4 Reconstruction program and algorithms 

4.1 Reconstruction algorithms for the drift chamber

The track-reconstruction algorithms [17] are based on the program developed 
for the ARGUS drift chamber [18]. This program has been adapted to the 
all-stereo geometry of the KLOE DC and tuned to the specific topology of 
KLOE events to optimize the efficiency of vertex reconstruction throughout 
the DC volume. The DC geometry, the space-time (s-t) relations
for the different types of drift cells, and the map of the magnetic field are
described in detail in the database. Event reconstruction is performed in three
steps: 1) pattern recognition, 2) track fitting, and 3) vertex fitting. Each step 
is handled separately and produces the input information for the subsequent 
step; this information is stored in YBOS banks. 

The first step of the track-reconstruction chain is pattern recognition (PR).
The PR algorithm searches for track candidates and provides rough estimates 
of their parameters. Track segments are first searched for in the xy plane; 
then the z projections are obtained. In an axial drift chamber, the particle 
trajectory in the xy plane is well approximated by a circle (except for cor- 
rections due to energy loss and multiple scattering, which are negligible at 
the PR stage). In the KLOE DC, since the wires are strung with a stereo 
angle, a particle leaves a pattern that appears as two nearby circles, one for 
each stereo view. The PR algorithm first searches for track candidates in each 
stereo view. Starting from the outermost layer, hit chains are built up by as- 
sociating hits close in space on the basis of curvature compatibility. In order 
to resolve left-right ambiguities, a minimum of four hits in at least two wire 
layers are required to create a single-view track candidate. 

At the end of the hit-association stage for each view, a filter exclusively assigns 
hits shared between track candidates to the better candidate. Each track candidate
is then fitted and its parameters are computed. The track candidates
from the two views are then combined in pairs according to their curvature 
values and geometrical compatibility. Finally, the z projection for each pair 
is determined from a three-dimensional fit to all associated hits. At the PR 
stage, the magnetic field is assumed to be homogeneous, multiple scattering 
and energy loss are not treated, and rough s-t relations (see Sec. 4.2) are used. 

The track-fitting (TF) procedure minimizes a $\chi^2$ function based on the comparison
between the measured and expected drift distances for each hit. Recurrent
tracing relations are used at each step to determine the positions of
successive hits from the estimated track parameters and the rough s-t rela- 
tions; the drift distance is then corrected using more refined s-t relations that 
depend on the track parameters. Drift distances are recalculated with each 
iteration of the fit to make use of the previous determination of the track 
orientation with respect to the cell. 

Tracks are described by connected helical segments. Local variations in the 
magnetic field are taken into account at each step, together with the effects of 
energy loss and multiple scattering. The momentum loss between consecutive 
hits is computed assuming the pion mass. Multiple scattering is accounted for 
by dividing the track into segments such that the estimated transverse dis- 
placement due to multiple scattering over the length of the segment is smaller 
than the spatial resolution. The values of the effective scattering angles in the 
transverse and longitudinal planes are then treated as additional parameters 
in the track fit. 

After a first iteration, a number of procedures improve the quality of the track 
fit. In particular, dedicated algorithms are used to 

• check the sign assignment of the drift distance hit by hit; 

• add hits that were missed by the PR algorithm; 

• reject hits wrongly associated to the track by the PR algorithm; 

• identify split tracks and join them; 

• identify kinked tracks and split them. 

As an example of the performance of the TF procedure, in Fig. 4 we illustrate 
the momentum resolution for Bhabha events as a function of the polar angle 
$\theta$. Over a large range in $\theta$, $\sigma_p/p$ is ~0.3%. The deterioration of the resolution 
at low angle is in accordance with the expected $\cot\theta$ behavior. 

Fig. 4. Momentum resolution $\sigma_p/p$ as a function of polar angle $\theta$, for Bhabha events. 

At the end of the DC-reconstruction chain, the tracks from the TF procedure 
are used to search for primary and secondary vertices. For each track pair, a 
$\chi^2$ function is evaluated from the distances of closest approach between tracks; 
the covariance matrices from the TF stage are used to evaluate the errors. The 
vertex position is determined by minimizing this $\chi^2$. To reduce the number of 
combinations, the tracks are first extrapolated to the beam-crossing point in 
the transverse plane and primary vertices are searched for using tracks with 
an impact parameter smaller than 10% of their radius of curvature. Secondary 
vertices are then searched for among tracks not associated to any other vertex. 
For tracks that intersect the beam pipe or inner DC walls in the extrapolation, 
the track momentum is corrected for energy loss and the effect of multiple 
scattering is taken into account in the covariance matrix. The pion mass is 
assumed for the evaluation of these corrections. 

For vertices inside the beam pipe, the vertex-position resolution is about 2 mm 
in x, y, and z. In Fig. 5, we show the distribution of the vertex-position 
residuals in x for MC $K_S \to \pi^+\pi^-$ decays. The $\pi\pi$ invariant-mass distributions 
for $K_S \to \pi^+\pi^-$ decays in data and MC samples are compared in Fig. 6. The 
mass resolution for this decay is seen to be ~0.8 MeV/$c^2$. 

Work is in progress on an algorithm to calculate the specific ionization dE/dx 
for reconstructed tracks on the basis of the charge measurements from the 
ADCs recently added to the DC readout electronics. 

4.2 Calibration of the space-time relations 

Several effects influence the time response of the KLOE DC. The drift velocity 
of the helium-based gas mixture does not saturate with the electric field, so 
the relation between the drift time and the impact parameter of the track is 
not linear. Moreover, due to the geometry of the drift cells, the electric field 
configuration changes along the wire. This effect produces a dependence of the 
space-time (s-t) relations upon the orientation of the track and its position 
along the wire. 

Fig. 5. Distribution of vertex-position residuals in x for MC $K_S \to \pi^+\pi^-$ events. 

Fig. 6. Invariant mass distributions for $K_S \to \pi^+\pi^-$ events. Points and histogram 
show the distributions for data and MC events, respectively. 

Simulations have shown that the s-t relations can be parameterized in terms 
of the angles $\beta$ and $\phi$ defined in Fig. 7 [19]. Six cells with different values of $\beta$ 
have been chosen as reference cells. For each reference cell, the s-t relations 
are parameterized for 36 bins in $\phi$, each 10° wide. Since only the upper half of 
the cell is deformed, in 20 of the bins in $\phi$, the s-t relations are the same for 
all six reference cells. There are therefore a total of 16 × 6 + 20 = 116 
parameterizations for the small cells, and 116 for the large cells. Each s-t relation 
is represented as a 5th-order Chebyshev polynomial [20], $t_\mathrm{drift} = P_\mathrm{cheb}(C_i^k, d)$, 
where $t_\mathrm{drift}$ is the measured time, d is the impact parameter, and the 6 × 232 
coefficients $C_i^k$ ($k = 1, \ldots, 232$ and $i = 0, \ldots, 5$) parameterize the "fine" s-t 
relations as described above. 

Fig. 7. Definition of the angles $\beta$ and $\phi$. 
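The bookkeeping and the polynomial evaluation described above can be sketched as follows. This is a minimal illustration, not the KLOE implementation: the mapping of the impact parameter onto the Chebyshev domain [-1, 1], the cell half-width `d_max`, and the coefficient values are all assumptions made for the example.

```python
def count_parameterizations(n_reference_cells=6, n_phi_bins=36, n_shared_bins=20):
    """Number of distinct s-t relations for one cell size (small or large)."""
    n_distinct_bins = n_phi_bins - n_shared_bins       # 16 bins differ per cell
    return n_distinct_bins * n_reference_cells + n_shared_bins

def chebyshev_eval(coeffs, u):
    """Evaluate sum_i coeffs[i] * T_i(u) with the Clenshaw recurrence."""
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = c + 2.0 * u * b1 - b2, b1
    return coeffs[0] + u * b1 - b2

def drift_time(coeffs, d, d_max):
    """Drift time for impact parameter d, with d in [0, d_max] mapped to
    [-1, 1] (assumed mapping, for illustration only)."""
    u = 2.0 * d / d_max - 1.0
    return chebyshev_eval(coeffs, u)

per_cell_size = count_parameterizations()       # 16 * 6 + 20
total_relations = 2 * per_cell_size             # small and large cells
total_coefficients = 6 * total_relations        # six coefficients, i = 0..5

# Hypothetical coefficients of a 5th-order expansion (not calibration values)
example_coeffs = [400.0, 380.0, 30.0, -5.0, 1.0, 0.5]
t_example = drift_time(example_coeffs, d=0.5, d_max=1.0)
```

With the mapping assumed here, d = d_max/2 corresponds to u = 0, where only the even-order terms contribute.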



The s-t relations are determined using cosmic-ray events, which illuminate 
the chamber volume nearly uniformly and cover the entire range in the angle 
$\phi$. At the PR level, the values of $\phi$ and $\beta$ for each cell are unknown, since the 
trajectory of the particle has not yet been determined. At this level, the cell 
response is therefore described by a single s-t relation, which is an average 
over all track orientations and drift-cell shapes. This "raw" s-t relation is 
parameterized by the sum of three polynomials. 

There are four contributions to the signal arrival time for each wire: 

$$t = t_\mathrm{TOF} + t_\mathrm{wire} + t_\mathrm{drift} + t_0. \qquad (1)$$

Here, $t_\mathrm{TOF}$ is the particle time of flight up to the wire hit, $t_\mathrm{wire}$ is the 
propagation time of the signal along the wire, $t_\mathrm{drift}$ is the drift time, and $t_0$ is a time 
offset. The offsets $t_0$ are calculated using cosmic-ray events at the beginning 
of each data-taking period (i.e., every few months), or whenever the readout 
electronics are reconfigured. About $10^7$ events are required in order to obtain 
the $t_0$ estimates. The $t_\mathrm{drift} + t_0$ terms are isolated by computing $t_\mathrm{TOF}$ and $t_\mathrm{wire}$ 
event-by-event, approximating cosmic-ray tracks by straight lines [6]. 

Calibration of the s-t relations is performed by an iterative procedure which 
reconstructs tracks, checks the residuals (the difference between the impact 
parameters estimated using the existing s-t relations and those given by the 
track fit), and, if required, produces a new set of calibration parameters. The 
procedure starts by reconstructing a calibration sample (typically, cosmic-ray 
events) with the standard PR and TF algorithms. The mean residuals as a 
function of reconstructed impact parameter are then obtained for each set 
of hits corresponding to each of the 232 s-t relations. The impact parameters 
estimated from the drift time of each hit are then corrected by the corresponding 
value of the mean residual, and the tracks are reconstructed again. The 
iteration is halted when, for each of the 232 parameterizations, the corrections 
are smaller than 40 μm for hits in the central part of the drift region of their 
cells. Finally, the 232 fine s-t relations are fitted, and the new coefficients $C_i^k$ 
are calculated. 

Fig. 8. Spatial resolution as a function of the impact parameter for small (left) and 
large (right) cells. 

The calibration program is incorporated into the KLOE online system. A synchronous 
procedure automatically starts at the beginning of each run, and selects 
80 000 cosmic-ray events from the event-building nodes using kid. These 
events are then tracked using the existing s-t relations, and the absolute value 
of the average of the residuals for hits in the central part of the drift region 
is monitored. If this value exceeds 40 μm, 300 000 cosmic-ray events are collected, 
and the asynchronous procedure described above produces a new set 
of calibration constants. Depending on background conditions, the filters on 
the farm select events at a rate between 25 and 30 Hz. The event collection 
therefore takes about 3 hours, and a comparable amount of time 
is needed for the analysis [6]. A complete recalibration is only necessary a 
few times per data-taking period, essentially when the atmospheric pressure 
changes by more than 1%. 
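The collection time quoted above follows directly from the sample size and the filter rate. A short check, using only the numbers given in the text (the function name is ours):

```python
def collection_time_hours(n_events, rate_hz):
    """Time needed to collect n_events at a constant selection rate."""
    return n_events / rate_hz / 3600.0

# 300 000 cosmic-ray events at the two ends of the quoted rate range
fastest = collection_time_hours(300_000, 30.0)   # about 2.8 hours
slowest = collection_time_hours(300_000, 25.0)   # about 3.3 hours
```

Both ends of the range are consistent with the figure of about 3 hours.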

Fig. 8 shows the resolution averaged over all wires as a function of the recon- 
structed impact parameter. The spatial resolution is better than 200 μm over 
a large part of the drift cell. 
4.3 Momentum calibration 

The calibration of the absolute momentum scale was performed with the 
2.4 pb$^{-1}$ data sample collected in 2000 [21], in parallel with a survey of the 
mechanical distortions of the chamber and calibration of the space-time relations. 
Two- and three-body processes such as $e^+e^- \to e^+e^-$, $e^+e^- \to \mu^+\mu^-$, 
$e^+e^- \to \pi^+\pi^-\pi^0$, $K_S \to \pi^+\pi^-$, $K_L \to \pi^+\pi^-$, $K_L \to \pi\ell\nu$, $K_L \to \pi^+\pi^-\pi^0$, 
$K^\pm \to \pi^\pm\pi^0$, and $K^\pm \to \mu^\pm\nu$ were employed. Depending on the process, the 
invariant mass, missing mass, or secondary momentum in the rest frame of 
the decaying particle was reconstructed; deviations from the nominal values of 
these quantities were used as benchmarks for the calibration procedure. This 
approach allowed the investigation of distortion effects over the entire volume 
of the detector and the full range of momentum. Initially, the reconstructed 
momenta of low-angle Bhabha electrons deviated from the expected values 
by as much as 8 MeV/c. In general, the deviations showed a complex depen- 
dence on momentum, polar angle, azimuthal angle, and production point of 
the tracked particle. 

Two main sources of distortions were identified: 

(1) Measurement artifacts in the magnetic-field map 

The magnetic field was mapped with a mechanical system for the posi- 
tioning of an array of Hall probes at nominal field values of 0.3, 0.45, 
and 0.6 T before the DC was inserted into the solenoid [22]. In 2001, 
the maps were reexamined, with extra terms introduced to account for 
distortions due both to misalignment of the Hall probes with respect to 
their nominal positions on the arm spanning the solenoid volume, and to 
rotations and translations of the arm with respect to its nominal position 
in the KLOE reference system. Most of these geometrical effects could 
be isolated because the measurements were performed twice: first with 
the measurement arm moving from one end of the solenoid to the other, 
and then in the opposite direction, with the orientation of the measur- 
ing device reversed. Artifact field components thus appeared in the sum 
or difference of measurements performed by the same probe or by two 
neighboring probes. The typical size of artifact field components in the 
transverse plane was about 0.004 T. 

(2) Saturation of the magnetic field 

For optimum DAΦNE performance, KLOE must work at a nominal field 
value of about 0.52 T. A comparison of the maps at the three different 
nominal field values showed evidence for saturation. The effect was also 
found in a set of very precise measurements of the field as a function of 
current performed on the solenoid axis using an NMR probe. The NMR 
data showed deviations from linearity as large as 1%, increasing with 
distance from the solenoid axis and decreasing with distance from the 
endplates. Global corrections for the saturation of the longitudinal field 
component were applied using the shape of the excitation curve obtained 
by the NMR probe; local corrections were applied by interpolation of the 
three maps. Unfortunately, global saturation corrections for the trans- 
verse field components could not be computed. These corrections are 
thought to be on the order of 0.001 T in magnitude. 

With these corrections, low-angle Bhabha electrons are reconstructed with 
systematic momentum deviations of less than 500 keV/c, or approximately 
0.1%. Similar accuracy is found for all benchmark modes. The residual sys- 
tematic differences can be ascribed to interpolation error in the saturation 
correction. 



4.4 Reconstruction algorithms for the calorimeter 



The calorimeter is segmented into 2440 cells, which are read out by PMTs at 
each end (referred to as sides A and B in the following). Both charges $Q_\mathrm{ADC}$ 
and times $t_\mathrm{TDC}$ are recorded. For each cell, the particle arrival time t and the 
impact point s along the fiber direction are reconstructed using the times at 
the two ends as 

$$t = \frac{1}{2}\left(t^A + t^B - t_0^A - t_0^B\right) - \frac{L}{2v}, \qquad s = \frac{v}{2}\left(t^A - t^B - t_0^A + t_0^B\right), \qquad (2)$$

with $t^{A,B} = c^{A,B}\, t_\mathrm{TDC}^{A,B}$, where $c^{A,B}$ are the TDC calibration constants, $t_0^{A,B}$ are 
the overall time offsets, and L and v are the cell length and the light velocity 
in the fibers. The impact position in the transverse direction is provided by 
the locations of the readout elements. 
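Eq. (2) can be checked numerically with a short sketch. The function below takes the calibrated end times, so the TDC constants are assumed to have been applied already. The sign convention (s measured from the cell center, positive away from side A) is the one implied by the form of Eq. (2); the cell length, light velocity, and offsets are illustrative values, not calibration constants.

```python
def cell_time_and_position(t_a, t_b, t0_a, t0_b, length_cm, v_cm_per_ns):
    """Arrival time t (ns) and longitudinal coordinate s (cm) from Eq. (2)."""
    t = 0.5 * (t_a + t_b - t0_a - t0_b) - length_cm / (2.0 * v_cm_per_ns)
    s = 0.5 * v_cm_per_ns * (t_a - t_b - t0_a + t0_b)
    return t, s

# Illustrative values only
LENGTH, V = 430.0, 17.0          # cell length (cm), light velocity (cm/ns)
T0_A, T0_B = 3.0, 4.5            # overall time offsets (ns)
t_true, s_true = 5.0, 50.0       # particle arrival time and position

# Simulate the two end times: light travels L/2 + s to side A, L/2 - s to side B
t_a = t_true + (LENGTH / 2.0 + s_true) / V + T0_A
t_b = t_true + (LENGTH / 2.0 - s_true) / V + T0_B

t_rec, s_rec = cell_time_and_position(t_a, t_b, T0_A, T0_B, LENGTH, V)
```

The round trip recovers the input time and position, which confirms that the half-sum removes the propagation delay L/(2v) and the half-difference isolates s.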



The energy signal $E_i$ on each side of cell i is determined as 

$$E_i = \kappa_E\, g_i(s)\, \frac{S_i}{S_{\mathrm{mip},i}}, \qquad (3)$$

where $S = Q_\mathrm{ADC} - Q_{0,\mathrm{ADC}}$ is the charge collected after subtraction of the 
zero-offsets, and $S_\mathrm{mip}$ is the response to a minimum-ionizing particle crossing the 
calorimeter center. The correction factor g(s) accounts for light attenuation 
as a function of the impact position s along the fiber, while $\kappa_E$ is the overall 
energy scale factor. The final value of $E_i$ for the cell is taken as the mean of 
the determinations at each end. 
The calibration constants related to minimum-ionizing particles, $S_\mathrm{mip}$ and g, 
are acquired with a dedicated trigger before the start of each long data-taking 
period. The time offsets $t_0^{A,B}$ and the light velocity v in the fibers are evaluated 
every few days using high-momentum cosmic rays selected using drift-chamber 
information. In this iterative procedure, the tracks reconstructed in the drift 
chamber are extrapolated through the calorimeter, and the residuals between 
the expected and measured times for each cell are minimized. Finally, a procedure 
to determine the value of $\kappa_E$ and to refine the values of $t_0^{A,B}$ runs 
online [5]; it uses Bhabha and $e^+e^- \to \gamma\gamma$ events to establish a new set of 
constants every 100-200 nb$^{-1}$ (i.e., approximately every 2 hours during normal 
data taking). The procedures used to calibrate the calorimeter are further 
discussed in Ref. 2. 

Calorimeter reconstruction starts by applying the calibration constants to convert 
the measured quantities $Q_\mathrm{ADC}$ and $t_\mathrm{TDC}$ to the physical quantities S and t. 
Position reconstruction and energy/time corrections vs. s are then performed 
for each fired cell. Next, a clustering algorithm searches for groups of cells 
belonging to a given particle. In the first step, cells contiguous in $r\phi$ or $xz$ 
are grouped into pre-clusters. In the second step, the longitudinal coordinates 
and arrival times of the pre-clusters are used for further merging and/or splitting. 
The cluster energy, $E_\mathrm{cl}$, is the sum of the energies for all cells assigned 
to a cluster. The cluster position, $(x, y, z)_\mathrm{cl}$, and time, $t_\mathrm{cl}$, are computed as 
energy-weighted averages over the contributing cells. Cells are included in the 
cluster search only if times and amplitudes are available on both sides; otherwise, 
they are listed as "incomplete" cells. The available information from 
most of the incomplete cells is added to the existing clusters at a later stage 
by comparison of the (x, y) positions of such cells with the cluster centroids. 
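The definitions of the cluster quantities can be written compactly. The sketch below assumes that the cells of a cluster have already been grouped; the dictionary layout and the cell values are hypothetical.

```python
def build_cluster(cells):
    """Combine cells into a cluster.

    Each cell is a dict with keys "e" (energy), "x", "y", "z" (position),
    and "t" (time). The cluster energy is the sum of the cell energies;
    position and time are energy-weighted averages over the cells.
    """
    e_cl = sum(cell["e"] for cell in cells)

    def weighted(key):
        return sum(cell["e"] * cell[key] for cell in cells) / e_cl

    return {"e": e_cl, "x": weighted("x"), "y": weighted("y"),
            "z": weighted("z"), "t": weighted("t")}

# Two hypothetical cells (energies in MeV, positions in cm, times in ns)
cluster = build_cluster([
    {"e": 30.0, "x": 0.0, "y": 200.0, "z": 10.0, "t": 5.0},
    {"e": 10.0, "x": 4.0, "y": 200.0, "z": 14.0, "t": 6.0},
])
```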

The production of fragments from electromagnetic showers has been studied 
by comparing data and Monte Carlo samples of $e^+e^- \to \gamma\gamma$ events, with tight 
selection cuts applied to the two highest-energy clusters in the event (the 
"golden photons"). The distribution of the minimum distance $|\Delta x|$ between 
the golden photons and any of the other clusters is characterized by reasonable 
agreement between data and MC at large values of $|\Delta x|$; at low values 
of $|\Delta x|$ an appreciable discrepancy is observed. In this latter case, a similar 
discrepancy is observed for the distribution of the difference in time, $\Delta t$, 
between the selected clusters. The multiplicity of fragments in data exceeds 
that in MC events by about a factor of two and is dominated by clusters with 
energy below 50 MeV. We attribute these discrepancies to small inaccuracies 
in the descriptions of the shower development and time response in the Monte 
Carlo, so that the longitudinal cluster-breaking procedure performs differently 
for data and MC events. For this reason, depending upon the multiplicity of 
photons in the event, a split-cluster recovery procedure is applied at the analysis 
level to merge close clusters depending on their values of $|\Delta x|$, $\Delta t$, and 
energy. 
The energy, timing, and position resolutions for photons are measured using 
$e^+e^- \to \pi^+\pi^-\pi^0$ and radiative Bhabha samples. In both cases, the energy 
$E_\gamma$ and direction $\mathbf{p}_\gamma$ of one of the photons are predicted with high precision 
using only tracking information. The calorimeter response and resolution as 
a function of the photon polar angle $\theta_\gamma$ and energy $E_\gamma$ can therefore be 
parameterized, and the photon detection efficiency can be measured with high 
accuracy. 

The energy response as a function of $E_\gamma$ shows a linearity better than 1% 
down to 60 MeV, while a drop in the response of ~3% is observed at low energy. 
This is mostly due to imperfect recovery of the "incomplete" cells. The 
energy resolution is dominated by sampling fluctuations and is well parameterized 
as $5.7\%/\sqrt{E\,(\mathrm{GeV})}$. The light yield has been estimated by looking at 
the fluctuations in the ratio of the energy response at each side, $E^A/E^B$, and 
corresponds to ~700 photoelectrons per side for 1 GeV photons impinging at 
the center of the calorimeter [$\sigma_E/E$ from p.e. statistics ~ $2.7\%/\sqrt{E\,(\mathrm{GeV})}$]. The timing 
resolution has also been determined; the stochastic term is dominated by the 
light yield, and scales as $54\ \mathrm{ps}/\sqrt{E\,(\mathrm{GeV})}$, while a constant term of 140 ps 
must be added in quadrature to account for the jitter introduced by rephasing 
the KLOE trigger with the machine RF. The contribution due to the precision 
of the channel-by-channel calibration is estimated to be ~50 ps. In the 
transverse coordinates, the position resolution is dominated by the readout 
granularity, and is ~$4.4/\sqrt{12}$ cm, while in the longitudinal coordinate, s, it 
shows the expected $1.2\ \mathrm{cm}/\sqrt{E\,(\mathrm{GeV})}$ energy dependence. The reconstruction 
of the masses of neutral mesons ($\pi^0$, $\eta$, $K_S$, $K_L$) decaying to n-photon final 
states shows that, at KLOE energies, the mass resolution is completely dominated 
by the energy resolution, while the mass scale is set with an accuracy 
better than 1%. In Fig. 9, we compare the distributions of reconstructed $\pi^0$ 
and $K_S$ masses for $K_S \to \pi^0\pi^0$ events from data and MC. 
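The resolution parameterizations quoted above are collected in the sketch below for convenience. The function names are ours. The photoelectron estimate assumes Poisson fluctuations on the summed yield of the two sides, scaling linearly with energy; this is an assumption made to reproduce the quoted ~2.7%, not a statement of how the number was derived.

```python
import math

def energy_resolution(e_gev):
    """Fractional energy resolution sigma_E/E = 5.7% / sqrt(E [GeV])."""
    return 0.057 / math.sqrt(e_gev)

def time_resolution_ps(e_gev):
    """Stochastic term 54 ps / sqrt(E [GeV]) added in quadrature to 140 ps."""
    return math.hypot(54.0 / math.sqrt(e_gev), 140.0)

def longitudinal_resolution_cm(e_gev):
    """Resolution on the coordinate s along the fibers: 1.2 cm / sqrt(E [GeV])."""
    return 1.2 / math.sqrt(e_gev)

# Transverse resolution from the readout granularity: 4.4 cm / sqrt(12)
TRANSVERSE_RESOLUTION_CM = 4.4 / math.sqrt(12.0)

def pe_statistics_term(n_pe_per_side, e_gev=1.0):
    """Photoelectron-statistics contribution to sigma_E/E (assumed Poisson,
    both sides summed, yield proportional to energy)."""
    return 1.0 / math.sqrt(2.0 * n_pe_per_side * e_gev)

sigma_e_100mev = energy_resolution(0.1)      # about 18% at 100 MeV
sigma_t_1gev = time_resolution_ps(1.0)       # about 150 ps at 1 GeV
pe_term = pe_statistics_term(700)            # about 2.7% at 1 GeV
```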
4.5 Determination of the absolute time scale and event-start time 

To run at the design luminosity, DAΦNE can operate with 120 bunches per 
ring, which corresponds to a bunch-crossing period equal to the machine RF 
period, $t_\mathrm{RF} = 2.715$ ns. Due to the large spread of the particle arrival times 
and short bunch-crossing period, the trigger time does not identify the bunch 
crossing that produced an event; the time at which this bunch crossing occurred 
must therefore be determined offline. In order not to spoil the excellent 
EmC time resolution, the start to the TDC system is obtained by synchronizing 
the level-1 trigger with a clock that is phase-locked to the DAΦNE 
radiofrequency signal. The clock period is $4\,t_\mathrm{RF} = 10.85$ ns. The calorimeter 
times are measured in common-start mode and are given by the TDC stops 
from the discriminated PMT signals: 

$$t_\mathrm{cl} = t_\mathrm{TOF} + \delta_c - N_\mathrm{BC}\, t_\mathrm{RF}, \qquad (4)$$

where $t_\mathrm{TOF}$ is the time of flight of the particle from the event origin to the 
calorimeter, $\delta_c$ is the sum of all offsets due to electronics and cable delays, and 
$N_\mathrm{BC}\, t_\mathrm{RF}$ is the time needed to generate the TDC start (see Fig. 10). 

The quantities $\delta_c$ and $t_\mathrm{RF}$ are determined using $e^+e^- \to \gamma\gamma$ events. For such 
events, the distribution of $\Delta_\mathrm{TOF} = t_\mathrm{cl} - r_\mathrm{cl}/c$ shows well-separated peaks 
corresponding to the different values of $N_\mathrm{BC}$ for events in the sample (see Fig. 11a). 
We define $\delta_c$ as the position of the largest peak in the distribution, and obtain 
$t_\mathrm{RF}$ from the distance between peaks. This is done by calculating the discrete 
Fourier transform of the $\Delta_\mathrm{TOF}$ distribution and fitting the peak around 
$\nu = 1/t_\mathrm{RF}$ (see Fig. 11b). The absolute TDC time scale is obtained by imposing 
$t_\mathrm{RF}(\mathrm{fit}) = t_\mathrm{RF}$. Both $\delta_c$ and $t_\mathrm{RF}$ are determined with precision better than 
4 ps for every 200 nb$^{-1}$ accumulated. 

While measuring the ratio $\mathrm{BR}(K_S \to \pi^+\pi^-)/\mathrm{BR}(K_S \to \pi^0\pi^0)$, we found 
it necessary to apply an absolute correction of ~0.8% to the time scale to 
eliminate an observed dependence of $\beta^*_K$ on the trigger-formation time [23,24]. 
The error on the time scale was found to originate from two cooperating effects: 

• As seen from the distribution of $\Delta_\mathrm{TOF}$ as a function of $z_\mathrm{cl}$ in Fig. 11c, the 
characteristic value of $N_\mathrm{BC}$ in $e^+e^- \to \gamma\gamma$ events varies as a function of 
longitudinal position along the barrel. This is due to the light-propagation 
time in the fibers, which is the dominant delay in trigger-signal formation. 

• Because of a residual slewing effect, for any given value of $N_\mathrm{BC}$, $\Delta_\mathrm{TOF}$ 
depends on $z_\mathrm{cl}$, as seen from Fig. 11d. 

When taken together, these two effects lead to an error in determining the 
distance between the peaks in the $\Delta_\mathrm{TOF}$ distribution. Since 2001, we have 
corrected for the dependence of $\Delta_\mathrm{TOF}$ on $z_\mathrm{cl}$ using an ad hoc procedure before 
calibrating the calorimeter. This provides a stable $-0.7\%$ correction to the 
time scale. 

Fig. 10. Timing scheme for bunch-crossing signal, calorimeter signals, and level-1 
trigger formation. 

Since we want the cluster times to correspond to particle times of flight, a 
time offset $t_{0,\mathrm{evt}} = \delta_c - N_\mathrm{BC}\, t_\mathrm{RF}$ must be subtracted from all cluster times 
(see Eq. [4]). The trigger-formation time $N_\mathrm{BC}\, t_\mathrm{RF}$ varies on an event-by-event 
basis; it is determined offline at different points of the reconstruction path. A 
zeroth-order value for $N_\mathrm{BC}$ (and hence $t_{0,\mathrm{evt}}$) is obtained by assuming that the 
earliest cluster in the event is due to a prompt photon from the interaction 
point. By imposing $t_\mathrm{TOF} = r_\mathrm{cl}/c$ for this cluster, we obtain 

$$t_{0,\mathrm{evt}} = \delta_c - \mathrm{Nint}\!\left[\frac{r_\mathrm{cl}/c - t_\mathrm{cl} + \delta_c}{t_\mathrm{RF}}\right] t_\mathrm{RF}, \qquad (5)$$

where Nint[] stands for the nearest integer to the quantity in brackets. We 
refer to $t_{0,\mathrm{evt}}$ as the event-start time. 
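A minimal sketch of the zeroth-order event-start time of Eq. (5) follows. The value of t_RF is taken from the text; the offset, the cluster radius, and the 0.1 ns smearing are illustrative. Python's built-in `round` is used for Nint; it differs from a conventional nearest-integer function only at exact half-integers.

```python
C_CM_PER_NS = 29.9792458   # speed of light
T_RF_NS = 2.715            # machine RF period quoted in the text

def event_start_time(r_cl_cm, t_cl_ns, delta_c_ns, t_rf_ns=T_RF_NS):
    """Return (t0_evt, n_bc) from Eq. (5), assuming the cluster is a
    prompt photon from the interaction point."""
    n_bc = round((r_cl_cm / C_CM_PER_NS - t_cl_ns + delta_c_ns) / t_rf_ns)
    return delta_c_ns - n_bc * t_rf_ns, n_bc

# Illustrative event: a photon reaching r = 200 cm, delta_c = 20 ns, N_BC = 3,
# with the measured cluster time smeared by 0.1 ns
delta_c = 20.0
tof = 200.0 / C_CM_PER_NS
t_cl = tof + delta_c - 3 * T_RF_NS + 0.1      # Eq. (4) plus smearing

t0_evt, n_bc = event_start_time(200.0, t_cl, delta_c)
t_corrected = t_cl - t0_evt                   # time of flight plus smearing
```

Because the bunch-crossing index is rounded to an integer, a timing error well below t_RF/2 does not change the result; it simply remains in the corrected cluster time.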
Fig. 11. Calibration of EmC time scale using $e^+e^- \to \gamma\gamma$ events: a) Distribution of 
$\Delta_\mathrm{TOF}$, b) Detail of the peak at $\nu = 1/t_\mathrm{RF}$ in the discrete Fourier transform of the 
$\Delta_\mathrm{TOF}$ distribution, c) $\Delta_\mathrm{TOF}$ as a function of $z_\mathrm{cl}$, d) $\Delta_\mathrm{TOF}$ as a function of $z_\mathrm{cl}$ for a 
single peak in the $\Delta_\mathrm{TOF}$ distribution, corresponding to a single value of $N_\mathrm{BC}$. 



Soft clusters coming from the accidental coincidence of machine-background 
events with the $e^+e^-$ collision can arrive earlier than the fastest cluster from 
the collision event itself. To increase the reliability of the estimate of $t_{0,\mathrm{evt}}$, the 
cluster used for its evaluation must also satisfy the conditions $E_\mathrm{cl} > 50$ MeV 
and $(x_\mathrm{cl}^2 + y_\mathrm{cl}^2)^{1/2} > 60$ cm. 



4.6 Track-to-cluster association 



The track-to-cluster association module establishes correspondences between 
tracks in the drift chamber and clusters in the calorimeter. 

The procedure starts by assembling the reconstructed tracks and vertices into 
decay chains and isolating the tracks at the ends of these chains. For each 
of these tracks, the measured momentum and the position of the last hit in 
the drift chamber are used to extrapolate the track to the calorimeter. The 
extrapolation gives the track length $L_\mathrm{ex}$ from the last hit in the chamber to the 
calorimeter surface, and the momentum $\mathbf{p}_\mathrm{ex}$ and position $\mathbf{x}_\mathrm{ex}$ of the particle at 
the surface. The resulting impact point is then compared with the positions $\mathbf{x}_\mathrm{cl}$ 
of the reconstructed cluster centroids. A track is associated to a cluster if the 
distance to the centroid in the plane orthogonal to the direction of incidence 
of the particle on the calorimeter, $D_\mathrm{tcl} = |(\mathbf{x}_\mathrm{cl} - \mathbf{x}_\mathrm{ex}) \times \mathbf{p}_\mathrm{ex}/|\mathbf{p}_\mathrm{ex}||$, is less than 
30 cm. For each track, the associated clusters are ordered by ascending $D_\mathrm{tcl}$ 
values. 
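The association distance defined above is the component of the centroid displacement perpendicular to the extrapolated momentum. A short sketch, with hypothetical input values and function names:

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(c * c for c in a))

def track_cluster_distance(x_cl, x_ex, p_ex):
    """D_tcl = |(x_cl - x_ex) x p_ex| / |p_ex|."""
    displacement = tuple(c - e for c, e in zip(x_cl, x_ex))
    return norm(cross(displacement, p_ex)) / norm(p_ex)

def associated_clusters(x_ex, p_ex, centroids, max_distance_cm=30.0):
    """Indices of clusters within the 30 cm cut, ordered by ascending D_tcl."""
    distances = [(track_cluster_distance(x, x_ex, p_ex), i)
                 for i, x in enumerate(centroids)]
    return [i for d, i in sorted(distances) if d < max_distance_cm]

# Track entering the calorimeter at (200, 0, 0) cm, moving along x
x_ex, p_ex = (200.0, 0.0, 0.0), (100.0, 0.0, 0.0)
centroids = [(210.0, 30.0, 40.0), (210.0, 3.0, 4.0), (205.0, 0.0, 10.0)]
d_second = track_cluster_distance(centroids[1], x_ex, p_ex)
matched = associated_clusters(x_ex, p_ex, centroids)
```

The first centroid lies 50 cm from the track direction and is not associated; the other two are returned in order of increasing distance.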

Various event-classification algorithms classify clusters as due to neutral or 
charged particles. Most of these algorithms treat clusters as due to neutral 
particles if no associated tracks are identified by the track-to-cluster associa- 
tion module. 

While the standard track-to-cluster association algorithm provides the information 
necessary to estimate the arrival time for a charged particle at the 
surface of the calorimeter, the interval between the time of particle incidence 
and the measured cluster-centroid time, $\Delta t_\mathrm{EmC}$, can be significant, and must be 
taken into consideration in time-of-flight based particle-identification schemes. 
For example, for $\pi^+$'s which interact deeply (25-30 cm) in the calorimeter, 
$\Delta t_\mathrm{EmC}$ can be as much as 1 ns, as compared to a time of flight of ~8 ns. This 
time interval directly reflects the temporal profile of the energy deposition for 
the incident particle, and varies by particle species. For each species ($e^+$, $e^-$, 
$\mu^+$, $\mu^-$, $\pi^+$, and $\pi^-$), a simple, linear parameterization can be used to relate 
$\langle\Delta t_\mathrm{EmC}\rangle$ to the depth of the centroid along the direction of particle incidence. 
Because of residual differences between the temporal shower profiles observed 
in data and simulated in the Monte Carlo, these parameterizations have been 
performed separately using data and MC events. They are available for use in 
calculating expected particle times of flight at the analysis level. 



4.7 Event classification 

The KLOE event-classification library is composed of different modules for 
the identification of the major physics channels at DAΦNE. The main classification 
algorithms include those for the identification of 

• generic background: beam background, cosmic-ray muons, and fragments of 
small-angle Bhabhas; 

• large-angle Bhabhas and $e^+e^- \to \gamma\gamma$ events; 

• tagged $K_L$ or $K_S$ decays; 

• tagged $K^+$ or $K^-$ decays; 

• $\phi \to \pi^+\pi^-\pi^0$ decays; 

• $\pi^+\pi^- + n\gamma$ and fully neutral $n\gamma$ final states coming from various primary 
processes such as $e^+e^- \to \pi^+\pi^-\gamma$, $e^+e^- \to \phi \to \eta\gamma$ or $\eta'\gamma$, 
$e^+e^- \to \phi \to f_0(980)\gamma$ or $a_0(980)\gamma$, etc. 

Background events are discarded, while all of the other samples are separately 
archived (see also Sec. 3.1). In the following, we discuss the criteria used to 
identify events in each of these categories. 

The background-rejection algorithm is based on calorimeter clustering and 
DC hit counting, so that background events can be eliminated before DC re- 
construction, which is the most CPU-intensive section of our reconstruction 
program. For the identification of background events, cuts are applied on the 
number of clusters; the number of DC hits; the total energy in the calorimeter; 
the average polar angle, position, and depth of the (two) most energetic 
cluster(s); and the ratio between the number of hits in the innermost DC layers 
and the total number of DC hits. These cuts have been studied to minimize 
losses for physics channels. Additionally, a simple cut on anomalously high 
total energy deposits in the calorimeter is included to reject rarer 
machine-background topologies due to sporadic DAΦNE beam-loss events. 

The KLOE trigger system includes a veto for cosmic-ray muons that uses 
dedicated thresholds on the energy deposition in the outermost layer of the 
calorimeter. Cosmic-ray events that survive the trigger veto (~0.6 kHz out 
of ~3 kHz) are rejected by the background filter by identification of at least 
one cluster pair with relative timing, total energy deposition, and energy re- 
leased in the outermost calorimeter layer consistent with those expected for a 
relativistic muon. 

Small-angle Bhabha electrons can strike the focusing quadrupoles and shower 
inside the magnets and/or the QCAL calorimeter. Fragments from these show- 
ers are sometimes sufficient to trigger the experiment. Events of this type are 
identified by the presence of spatially concentrated clusters on the endcap 
calorimeters that arrive within a narrow time window. 

Large-angle Bhabha and $e^+e^- \to \gamma\gamma$ events are selected to calibrate the 
calorimeter and to evaluate the luminosity. These events are identified using 
only calorimetric information. They must have at least two clusters with 
energy $300\ \mathrm{MeV} < E_\mathrm{cl} < 800\ \mathrm{MeV}$ and polar angle $45° < \theta < 135°$. 
These clusters must arrive within a narrow time window and have 
$|180° - \theta_1 - \theta_2| < 10°$. A stringent cut on the angle between the two most 
energetic clusters, $\mathbf{x}_1 \cdot \mathbf{x}_2/(|\mathbf{x}_1||\mathbf{x}_2|) < -0.975$, is used to separate $\gamma\gamma$ events from 
Bhabhas. 
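The calorimetric selection can be sketched as follows. Two points are assumptions of the sketch rather than statements from the text: events passing the collinearity cut are labeled as $\gamma\gamma$ candidates (photons are not bent by the magnetic field, so their clusters are expected to be more nearly back-to-back), and the time-window requirement is omitted because its width is not quoted. The cluster representation is hypothetical.

```python
import math

def polar_angle_deg(pos):
    """Polar angle with respect to the z axis, in degrees."""
    x, y, z = pos
    return math.degrees(math.acos(z / math.sqrt(x * x + y * y + z * z)))

def cos_opening_angle(p1, p2):
    dot = sum(a * b for a, b in zip(p1, p2))
    n1 = math.sqrt(sum(a * a for a in p1))
    n2 = math.sqrt(sum(b * b for b in p2))
    return dot / (n1 * n2)

def classify_two_clusters(c1, c2):
    """Return "gamma_gamma", "bhabha", or None for a pair of clusters.

    Each cluster is a dict with "e" (MeV) and "pos" (x, y, z in cm).
    """
    thetas = []
    for c in (c1, c2):
        theta = polar_angle_deg(c["pos"])
        if not (300.0 < c["e"] < 800.0 and 45.0 < theta < 135.0):
            return None
        thetas.append(theta)
    if abs(180.0 - thetas[0] - thetas[1]) >= 10.0:
        return None
    # Assumed reading of the cut: nearly back-to-back pairs are photon pairs
    if cos_opening_angle(c1["pos"], c2["pos"]) < -0.975:
        return "gamma_gamma"
    return "bhabha"

a = {"e": 510.0, "pos": (200.0, 0.0, 0.0)}
b = {"e": 510.0, "pos": (-200.0, 0.0, 0.0)}
b_bent = {"e": 510.0, "pos": (-190.0, 60.0, 0.0)}
soft = {"e": 100.0, "pos": (-200.0, 0.0, 0.0)}
```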

A more precise measurement of the integrated luminosity is obtained by refining 
the large-angle Bhabha event selection with track reconstruction information. 
In particular, the two tracks in the event with the greatest number of associated 
DC hits must be of opposite charge and have momenta p > 400 MeV/c 
and polar angles $55° < \theta < 125°$. The agreement obtained for the distributions 
of important quantities such as the energy and angle of the Bhabha clusters 
for data and Monte Carlo events (generated with babayaga [25,26]) demonstrates 
that the event counting in the fiducial angular region is accurate to 
the same level as the precision of the generator itself. 

At KLOE, it is possible to tag $K_S$, $K_L$, $K^+$, and $K^-$ beams: the presence 
of a $K_S$ ($K_L$) signals the presence of a $K_L$ ($K_S$) on the opposite side of the 
detector, and the same applies for $K^+$'s and $K^-$'s. Pure $K_L$ beams are tagged 
by the identification of the $K_S \to \pi^+\pi^-$ decay. One charged vertex from two 
particles originating near the interaction point (IP) is required. Loose cuts 
on vertex position, particle momenta, and invariant mass are applied. The 
reconstruction of the $K_S$ decay allows the $K_L$ momentum to be predicted 
with a precision of better than 2 MeV/c. The overall tagging efficiency is 
~70%. $K_S$ beams are tagged by $K_L$ interactions in the calorimeter barrel. 
These interactions are signaled by high-energy clusters with typical arrival 
times of 30 ns due to the low momentum (110 MeV/c) of the kaons produced 
at DAΦNE. $K_L$ clusters used to tag $K_S$'s must have energy $E_\mathrm{cl} > 100$ MeV and 
velocity $0.17 < \beta < 0.28$, and must not be associated to any tracks in the drift 
chamber. The $K_S$ momentum is determined with a precision of better than 
2 MeV/c, as is also the case for the $K_L$ beam. A $K_S$ beam can also be tagged 
by looking for $K_L \to \pi^+\pi^-\pi^0$ decays, which are identified by the presence of a 
vertex in the DC satisfying kinematic cuts, and two clusters from the $\pi^0 \to \gamma\gamma$ 
decay. These clusters must satisfy opening-angle and time-of-flight cuts and 
must not be associated to any tracks in the DC. 
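The numbers quoted for the $K_L$ tag are mutually consistent, as the following sketch shows. The neutral-kaon mass of 497.611 MeV/c² is the standard value; the 200 cm flight distance to the barrel is an assumed, illustrative radius, and the function names are ours. The cluster time passed to the selection is taken to be already corrected for the event-start time.

```python
import math

C_CM_PER_NS = 29.9792458
M_K0_MEV = 497.611          # neutral-kaon mass

def beta_from_momentum(p_mev, mass_mev=M_K0_MEV):
    """Velocity beta = p / E for a particle of given momentum and mass."""
    return p_mev / math.hypot(p_mev, mass_mev)

def is_kl_tag_candidate(e_cl_mev, r_cl_cm, t_cl_ns, has_associated_track):
    """Apply the cuts quoted in the text to a calorimeter cluster."""
    beta = r_cl_cm / (C_CM_PER_NS * t_cl_ns)
    return e_cl_mev > 100.0 and 0.17 < beta < 0.28 and not has_associated_track

beta_kl = beta_from_momentum(110.0)                 # about 0.216
arrival_ns = 200.0 / (beta_kl * C_CM_PER_NS)        # about 31 ns at 200 cm
```

A 110 MeV/c kaon therefore falls in the middle of the accepted velocity window and reaches a radius of 200 cm after about 31 ns, in line with the typical arrival time of 30 ns.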

At KLOE, since the $K_S K_L$ pairs from $\phi$ decay are initially in a pure, 
antisymmetric state, the final-state decay products show characteristic interference 
patterns. By studying the relative-time distributions for decays to different 
final states, it is possible to measure various CP- and CPT-violation parameters 
[27]. The most interesting events for this type of analysis are those in 
which the $K_S$ and $K_L$ decays occur in close proximity to each other, i.e., 
both occur near the IP. In order to maximize the selection efficiency for such 
topologies, a dedicated algorithm has been developed. This algorithm searches 
for the presence of any combination of pairs of track and photon vertices that 
represent a possible pair of $K_S$ and $K_L$ decay modes. Good track vertices must 
have exactly two tracks of opposite charge. 

Events are selected for the charged-kaon sample by the identification of either
a pair of candidate kaon tracks originating near the IP, or a $K \to \mu\nu$ or
$K \to \pi\pi^0$ decay in the DC. In the first case, two tracks of opposite charge
with total momentum compatible with the decay kinematics are required.
In the second case, the kaon decay is recognized as a charged vertex with
two connected tracks of the same sign of charge. The vertex must lie within
$40 < R_{xy} < 150$ cm, and the momentum of the secondary in the rest frame of
the kaon must be within the range $180 < p^* < 270$ MeV/c.

Fig. 12. Distribution of $|\mathbf{p}_1| + |\mathbf{p}_2|$ vs. $c\,p_{miss} - E_{miss}$. a) $\phi \to \pi^+\pi^-\pi^0$ decays are
in the bottom-left region of the plot, while $\pi^+\pi^-(\gamma)$, $\mu^+\mu^-(\gamma)$, and $e^+e^-(\gamma)$ events
are concentrated in the top-right region. b) Enlarged view of the top-right region,
showing the contributions from $\pi^+\pi^-(\gamma)$, $\mu^+\mu^-(\gamma)$, and $e^+e^-(\gamma)$ events.

The $\phi \to \pi^+\pi^-\pi^0$ sample is obtained by searching for a vertex near the IP
($R_{xy} < 8$ cm, $|z| < 15$ cm) with two connected tracks of opposite charge. Cuts
on the sum of the track momenta, $p_{sum} = |\mathbf{p}_1| + |\mathbf{p}_2|$, the missing momentum,
$p_{miss}$, and the missing energy, $E_{miss}$, are used to isolate the sample (see Fig. 12).



The search for the final states ($\pi^+\pi^-$, $\mu^+\mu^-$, $e^+e^-$) + $n\gamma$ requires one charged
vertex near the IP with $p_{sum} < 1020$ MeV/c, [...] > 90 MeV, and
$c\,p_{miss} - E_{miss} > -50$ MeV. Different windows in $p_{sum}$ and the quantity
$c\,p_{miss} - E_{miss}$ are used to separate $\mu$/$\pi$/$e$ final states, as seen in Fig. 12.

Fully neutral $n\gamma$ final states are identified by the presence of at least three
clusters in the calorimeter that are not associated to tracks in the DC, and
which have times of flight consistent with photon travel from the IP.
4.8 Redetermination of the event-start time



As explained in Sec. 4.5, the event-start time $t_{0,evt}$, or equivalently, the
integer number of bunch crossings $N_{BC}$ needed for trigger formation, must be
determined offline by analysis of the cluster times. Before tracking and event
classification, $N_{BC}$ is obtained by assuming that the earliest qualifying cluster
in the event is due to a photon coming from the IP. This first determination
allows the event to be reconstructed and classified by physics channel. However,
many physics channels contain no prompt photons in the final state, so
this determination of $N_{BC}$, and therefore the corrected cluster times $t_{cl}^{(0)}$, may
differ from the actual times of flight by an integer number of bunch crossings
$\Delta N_{BC}$:

$$t_{cl}^{(0)} = t_{TOF} - \Delta N_{BC}\, t_{RF}. \qquad (6)$$

For such events, it is usually possible to obtain the remaining correction term
using a recognized topology associated to a cluster. The term needed is then

$$\Delta N_{BC}\, t_{RF} = \frac{L}{\beta c} - t_{cl}^{(0)}, \qquad (7)$$

where $L$ is the estimated path length from the origin to the selected cluster,
and $\beta$ is evaluated using the relevant mass hypothesis. For example, if $\Delta N_{BC}$
is evaluated from a primary track, $\beta$ is evaluated from the track momentum.
If the track associated to the cluster comes from a secondary vertex, the term
$L/\beta c$ becomes $\sum_i L_i/\beta_i c$, where the sum is over the contributions from primary
and secondary particles (including possibly photons). The times of all clusters
in the event are then reevaluated as $t_{cl} = t_{cl}^{(0)} + \Delta N_{BC}\, t_{RF}$. This procedure has
been implemented for events classified as

• charged kaons, by the identification of a $K \to \pi\pi^0$ or $K \to \mu\nu$ decay;

• neutral kaons, by the identification of a $K_S \to \pi^+\pi^-$ decay;

• neutral radiative decays.

For charged-kaon events, if the $K \to \pi\pi^0$ topology is recognized, the
extrapolations to the calorimeter of the clusters from the $\pi^0 \to \gamma\gamma$ decay and the
charged-pion track can be used to determine $\Delta N_{BC}$. If instead the $K \to \mu\nu$
topology is recognized, $\Delta N_{BC}$ is estimated from the momenta and lengths of
the kaon and muon tracks. For neutral-kaon events with $K_S \to \pi^+\pi^-$ decays,
$\Delta N_{BC}$ is determined using the first pion to reach the calorimeter. Neutral
radiative decays do contain prompt photons; the goal in redetermining
the event-start time in this case is to correct situations in which $N_{BC}$ is at
first incorrectly determined because of the accidental coincidence of one or more
beam-background clusters. For such events, if the second cluster with $E_{cl} > 50$ MeV
and $R_{xy} > 60$ cm arrives more than 4 ns after the first, $\Delta N_{BC}$ is calculated
using the second cluster.
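
The correction of Eqs. (6) and (7) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names are hypothetical, and the bunch-crossing period `T_RF_NS` is an assumed value that is not quoted in this section.

```python
import math

C_CM_PER_NS = 29.9792458   # speed of light in cm/ns
T_RF_NS = 2.715            # assumed bunch-crossing period t_RF in ns (not from this section)

def beta_from_momentum(p_mev, mass_mev):
    """beta = p/E under a given mass hypothesis."""
    return p_mev / math.hypot(p_mev, mass_mev)

def expected_tof_ns(segments):
    """Sum of L_i / (beta_i c) over (length_cm, beta) segments of the flight path."""
    return sum(length / (beta * C_CM_PER_NS) for length, beta in segments)

def delta_n_bc(segments, t_cl0_ns, t_rf_ns=T_RF_NS):
    """Eq. (7): nearest integer number of bunch crossings still to be corrected."""
    return round((expected_tof_ns(segments) - t_cl0_ns) / t_rf_ns)

def corrected_time_ns(t_cl0_ns, dn_bc, t_rf_ns=T_RF_NS):
    """Reevaluated cluster time t_cl = t_cl^(0) + dN_BC * t_RF."""
    return t_cl0_ns + dn_bc * t_rf_ns

# Example: a primary pion track of 200 MeV/c travelling 210 cm to its cluster,
# whose first-pass cluster time is two bunch crossings too early (plus 0.3 ns of smearing).
pion_beta = beta_from_momentum(200.0, 139.57)
path = [(210.0, pion_beta)]
t_first_pass = expected_tof_ns(path) - 2 * T_RF_NS + 0.3
shift = delta_n_bc(path, t_first_pass)
```

For a track from a secondary vertex, `path` would simply hold one `(length, beta)` pair per particle in the chain, matching the sum $\sum_i L_i/\beta_i c$.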

4.9 Reconstruction of photon vertices in $K_L$ decays

The positions of photon vertices from $K_L$ decays are obtained from the cluster
times. Each photon defines a time-of-flight triangle: the first side is the segment
from the IP to the $K_L$ decay vertex, $\mathbf{L}_K$; the second is the segment from the
$K_L$ decay vertex to the centroid of the calorimeter cluster, $\mathbf{L}_\gamma$; and the third
is the segment from the IP to the cluster centroid, $\mathbf{L}$. The direction of $\mathbf{L}_K$ is
initially known because the $K_L$ decay is tagged. The photon-vertex position
is specified by the distance $L_K$, which is determined from

$$L^2 + L_K^2 - 2\,\mathbf{L}\cdot\mathbf{L}_K = L_\gamma^2,$$

$$L_K/\beta_K + L_\gamma = c\,t_\gamma, \qquad (8)$$

where $t_\gamma$ is the cluster time and $\beta_K$ is the $K_L$ velocity.

For the evaluation of $L_K$, the $K_L$ decay must be tagged by a $K_S \to \pi^+\pi^-$
decay. The direction of the $K_L$ is given by $\mathbf{p}_{K_L} = \mathbf{p}_\phi - \mathbf{p}_{K_S}$, where $\mathbf{p}_\phi$ is the
mean $\phi$ momentum as determined from Bhabha events in the same run. The
position of the IP is obtained by backward extrapolation along the $K_S$ flight
path.

$L_K$ is evaluated for each neutral cluster with energy $E_{cl} > 7$ MeV. The
energy-weighted average of the values of $L_K$ for each cluster is used as the final $L_K$
measurement.
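
One way to solve Eq. (8) is to eliminate $L_\gamma$, which gives a quadratic equation in $L_K$: $(1 - 1/\beta_K^2)L_K^2 + 2(c\,t_\gamma/\beta_K - \mathbf{L}\cdot\hat{\mathbf{L}}_K)L_K + L^2 - c^2t_\gamma^2 = 0$. The Python sketch below solves it and keeps the root with $L_\gamma \geq 0$, then forms the energy-weighted average. The function names and the root-selection rule are illustrative choices, not the KLOE implementation.

```python
import math

C_CM_PER_NS = 29.9792458  # speed of light in cm/ns

def kl_decay_length(cluster_pos_cm, kl_dir, t_gamma_ns, beta_k):
    """Solve Eq. (8) for L_K; positions are measured from the IP.

    Returns None if no physical solution (0 <= L_K <= beta_K c t) exists.
    """
    big_l = math.sqrt(sum(x * x for x in cluster_pos_cm))
    norm = math.sqrt(sum(u * u for u in kl_dir))
    l_dot_u = sum(x * u for x, u in zip(cluster_pos_cm, kl_dir)) / norm
    ct = C_CM_PER_NS * t_gamma_ns
    a = 1.0 - 1.0 / beta_k**2
    b = 2.0 * (ct / beta_k - l_dot_u)
    c0 = big_l**2 - ct**2
    disc = b * b - 4.0 * a * c0
    if disc < 0.0:
        return None
    roots = [(-b + sign * math.sqrt(disc)) / (2.0 * a) for sign in (1.0, -1.0)]
    # L_gamma = c t - L_K / beta_K must not be negative
    physical = [r for r in roots if 0.0 <= r <= beta_k * ct]
    return min(physical) if physical else None

def energy_weighted_length(lengths_and_energies):
    """Energy-weighted average of per-cluster L_K values, given (L_K, E_cl) pairs."""
    total_e = sum(e for _, e in lengths_and_energies)
    return sum(l * e for l, e in lengths_and_energies) / total_e

# Example: K_L flying along x with beta = 0.2158, decaying 100 cm from the IP,
# with the photon cluster 200 cm away from the decay vertex.
beta_k = 0.2158
t_gamma = (100.0 / beta_k + 200.0) / C_CM_PER_NS
l_k = kl_decay_length((100.0, 200.0, 0.0), (1.0, 0.0, 0.0), t_gamma, beta_k)
```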

The accuracy in the location of the photon vertex has been studied using
$K_L \to \pi^+\pi^-\pi^0$ decays, in which the decay position can be independently
determined using clusters and tracks, with much greater precision in the latter
case. The dependence of the position resolution on decay distance is illustrated
in Fig. 13.



Fig. 13. Resolution $\sigma(L_K)$ on the determination of the $K_L$ decay length using photon
vertices, as a function of $L_K$, in $K_L \to \pi^+\pi^-\pi^0$ events. The contributions from the
uncertainties on the point of photon incidence on the calorimeter, the cluster time,
and the $K_L$ flight direction are shown as the dot-dashed, dashed, and dotted lines,
respectively.

5 Monte Carlo: physics generators and detector simulation

The KLOE Monte Carlo program, GEANFI, is based on the GEANT 3.21 library
[28, 29] widely used in current high-energy and astroparticle physics
experiments. GEANFI incorporates a detailed description of the KLOE apparatus,
including

• the interaction region: the beam pipe, the low-$\beta$ quadrupoles, and the QCAL
calorimeters;

• the drift chamber; 

• the endcap and barrel calorimeters; 

• the superconducting magnet and the return yoke structure. 

A set of specialized routines has been developed to simulate the response of
each detector, starting from the basic quantities obtained from the GEANT
particle-tracking and energy-deposition routines. In Secs. 5.3 and 5.4, we discuss
various aspects of the simulation of the DC and EmC response and compare
performance results obtained using data and Monte Carlo events.

5.1 Generators for continuum processes and $\phi$ production

GEANFI contains the code to generate the physics of interest at DAΦNE. The
cross sections for the relevant processes in $e^+e^-$ collisions at $\sqrt{s} = 1.02$ GeV
are listed in Table 3.

Table 3

Cross sections for several $e^+e^-$ interaction processes at $\sqrt{s} = 1.02$ GeV. For the
process $e^+e^- \to \phi$, the visible cross section is listed.

A precise Bhabha-event generator is required for the measurement of the
DAΦNE luminosity. To reach an accuracy of a few per mil for the effective
cross section, radiative corrections must be properly treated. BHAGEN,
an exact $O(\alpha)$ generator based on the calculations of Ref. 30, has been
implemented in GEANFI from the very beginning. More recently, the BABAYAGA
generator [25,26] has been interfaced with GEANFI. This generator is based on
the application to QED of the parton-shower method originally developed for
perturbative QCD calculations. The generator takes into account corrections
due to initial-state radiation (ISR), final-state radiation (FSR), and ISR-FSR
interference, and has an estimated accuracy of 0.5%. BABAYAGA can also be
used to generate $e^+e^- \to$ [...] and $e^+e^- \to$ [...] events.

KLOE can measure $\sigma(e^+e^- \to \pi^+\pi^-)$ using $e^+e^- \to \pi^+\pi^-\gamma$ events in which
the photon is radiated from the initial state. For this analysis, we use the
PHOKHARA 3 generator [31], which includes leading-order (LO) and
next-to-leading-order (NLO) treatment of the ISR and FSR terms. NLO effects have
been shown to have an impact on the precision achievable for the KLOE
measurement of $\sigma(e^+e^- \to \pi^+\pi^-)$. A previous generator developed by the
same authors, EVA [32], was based on LO calculations of the ISR and FSR
diagrams, supplemented by an approximate inclusion of additional collinear
radiation based on structure functions. KLOE can also generate events with
EVA. The possibility of changing the structure functions has been used in our
analyses of radiative $\phi$ decays.

The process $e^+e^- \to \omega\pi^0$ is simulated with all $\omega$ decay modes enabled, the $\omega$
width taken into account, and a $1 + \cos^2\theta$ dependence assumed for the $\omega\pi^0$
angular distribution. In particular, the process $e^+e^- \to \omega\pi^0$ with $\omega \to \pi^0\gamma$ is
one of the background channels for the analysis of the decays $\phi \to f_0(980)\gamma$
and $a_0(980)\gamma$; it is treated according to the VDM matrix element described in
Ref. 33.
Fig. 14. Dependence on $\sqrt{s}$ of the cross section for $\phi$-meson production and decay
into each of the major modes, $K^+K^-$, $K_S K_L$, $\rho\pi$, and $\eta\gamma$. Curves show the
parameterization used in the MC; points are KLOE measurements from 2002.

The simulation of $\phi$-meson production and decay includes the production of
ISR photons by the interacting beams. The ISR generator is based on the
Kleiss formalism discussed in Ref. 34, in which it is shown that the $O(\alpha)$
radiative corrections completely factorize from the lowest-order interaction
cross section. The effects of hard, soft, and virtual bremsstrahlung photons are
taken into account (hard photons, with $E > 1$ MeV, are explicitly simulated)
by multiplying a photon-emission factor with the nonradiative cross section
evaluated at an effective CM energy that depends on the hardness of the ISR
photon. The MC dependence on $\sqrt{s}$ of the cross section for $\phi$ production
followed by decay into each of the dominant modes ($K^+K^-$, $K_S K_L$, $\rho\pi$, and
$\eta\gamma$) is shown in Fig. 14 and compared with KLOE measurements conducted
in 2002.



5.2 Generators for meson decays 

The routines in the GEANT library simulate two- and three-body decays
according to pure phase-space distributions. Only the main decay modes of
muons, pions, kaons, and $\eta$ mesons are simulated. We have enriched the list
of simulated particle-decay modes to include rare decays and refined the
kinematic distributions of the secondaries to include the correlations expected from
the matrix elements for the different decay processes.

The generator for $\phi$ events discussed in Sec. 5.1 selects the $\phi$ decay channel
and declares the decay products to GEANT. Initial-state reactions and the
beam-energy spread of the machine ($\Delta E_{beam}/E_{beam} = 0.04\%$ at DAΦNE) are
taken into account event by event in the simulation of the decay kinematics.

For $\phi \to K^+K^-$ and $K_S K_L$ decays, the kaons are distributed as
$dN/d\cos\theta \propto \sin^2\theta$ in the polar angle.
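
Angular distributions of this kind are straightforward to sample by acceptance-rejection. The Python sketch below is a generic illustration, not the GEANFI routine: it draws $\cos\theta$ from $\sin^2\theta = 1 - \cos^2\theta$ (kaons from $\phi$ decay) and from $1 + \cos^2\theta$ (the form quoted in this section for the $\omega\pi^0$ channel and for scalar mesons from radiative decays). All names and the random seed are choices made for the example.

```python
import random

def sample_cos_theta(pdf, pdf_max, rng):
    """Acceptance-rejection sampling of cos(theta) in [-1, 1] from an unnormalized pdf."""
    while True:
        c = rng.uniform(-1.0, 1.0)
        if rng.uniform(0.0, pdf_max) <= pdf(c):
            return c

def sin2_theta(c):
    """dN/dcos(theta) proportional to sin^2(theta) = 1 - cos^2(theta)."""
    return 1.0 - c * c

def one_plus_cos2_theta(c):
    """dN/dcos(theta) proportional to 1 + cos^2(theta)."""
    return 1.0 + c * c

rng = random.Random(12345)
kaon_cos = [sample_cos_theta(sin2_theta, 1.0, rng) for _ in range(20000)]
scalar_cos = [sample_cos_theta(one_plus_cos2_theta, 2.0, rng) for _ in range(20000)]

# Analytic expectations for <cos^2(theta)>: 1/5 for sin^2, 2/5 for 1 + cos^2
mean_c2_kaon = sum(c * c for c in kaon_cos) / len(kaon_cos)
mean_c2_scalar = sum(c * c for c in scalar_cos) / len(scalar_cos)
```

Comparing the sample mean of $\cos^2\theta$ with the analytic values 1/5 and 2/5 gives a quick check that the intended shape is being generated.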

In the $\phi \to \rho\pi$ channel, the $\rho$ decays dominantly to $\pi\pi$; other possible $\rho$
decays are to $\pi\gamma$, $\eta\gamma$, and [...]. The three-body phase space of the secondaries
is modified assuming a Breit-Wigner shape for the $\rho$ resonance, with $m_\rho =
776.1$ MeV/$c^2$ and $\Gamma_\rho = 145.6$ MeV/$c^2$ for all three $\rho$ charge states. In $\phi \to \rho\pi$
decays with $\rho \to \pi\pi$, the contribution from direct $\phi \to 3\pi$ decay and the
interference between direct and $\rho$-mediated processes are simulated, using the
values measured by KLOE for the relative contributions from each term [35].

Scalar mesons from radiative $\phi$ decays are distributed as $dN/d\cos\theta \propto 1 +
\cos^2\theta$ in the polar angle and are generated by a separate set of routines, which
in some cases (e.g., the EVA generator, customized for KLOE) offer a choice
of production models.

Besides the major modes, the list of neutral-kaon decays simulated includes
rare decays such as $K_S \to \pi\ell\nu$, $K_S \to \pi^+\pi^-\pi^0$, and $K_S \to \pi^0\pi^0\pi^0$.

For the simulation of semileptonic kaon decays, kaon decays into two pions,
and leptonic decays of charged kaons, radiative corrections are taken into
account. In order to avoid problems with divergences at low radiated-photon
energy, we use the method of Ref. 36 to sum the amplitudes for virtual and
real radiative processes to all orders in $\alpha$. We have verified that the soft-photon
approximation used in this treatment is valid for the entire range of photon
energies in the kaon decays of interest. Whenever a decay is generated in which
the radiated energy is more than 0.1 MeV, a final-state photon is explicitly
simulated.

The Dalitz plots for the $K \to 3\pi$ decays are generated according to the form
$|M|^2 = 1 + gu + hu^2 + jv + kv^2$, where $u = (s_3 - s_0)/m_{\pi^+}^2$ and $v = (s_1 - s_2)/m_{\pi^+}^2$,
while $s_i = (P_K - P_i)^2$ and $s_0 = (1/3)\sum_i s_i$. The values of the parameters $g$, $h$,
$j$, and $k$ used in the simulation are those published by the PDG [37].
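
The Dalitz-plot form above translates directly into an event weight. The following Python sketch evaluates $|M|^2$ from the three invariants $s_i$; the function name is hypothetical, and the slope values used in the example are arbitrary placeholders rather than the PDG values used in the simulation.

```python
M_PI_CHARGED = 139.57  # charged-pion mass in MeV/c^2

def dalitz_weight(s1, s2, s3, g, h, j, k, m_pi=M_PI_CHARGED):
    """|M|^2 = 1 + g u + h u^2 + j v + k v^2 for a K -> 3 pi decay.

    s1, s2, s3 are the invariants s_i = (P_K - P_i)^2, in the same units as m_pi^2.
    """
    s0 = (s1 + s2 + s3) / 3.0
    u = (s3 - s0) / m_pi**2
    v = (s1 - s2) / m_pi**2
    return 1.0 + g * u + h * u * u + j * v + k * v * v

# Example with placeholder slopes and invariants in units of m_pi^2 (m_pi set to 1):
# s0 = 2, u = 1, v = -1, so the weight is 1 + 0.5 + 0.1 - 0.2 + 0.3 = 1.7
weight = dalitz_weight(1.0, 2.0, 3.0, g=0.5, h=0.1, j=0.2, k=0.3, m_pi=1.0)
```

Such a weight can be used in an acceptance-rejection step on top of phase-space generation, which is one common way to impose a matrix element on generated decays.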

The $\pi^0$ decays simulated include the Dalitz decay $\pi^0 \to e^+e^-\gamma$. All decay
modes of the $\eta$ and $\eta'$ mesons are simulated.
5.3 Drift chamber simulation



The chamber geometry as simulated consists of a cylindrical carbon-fiber and
aluminum inner wall, a cylindrical carbon-fiber outer wall, and two spherical
carbon-fiber endplates. The average material burden contributed by the
readout electronics installed on the endplates is also taken into account. The two
stiffening rings at the edges of the endplates and the 12 carbon-fiber struts are
simulated as well. In order to reduce CPU time consumption, the 52 000 wires
are not described in the GEANT geometry as volumes, but their presence is
taken into account at the tracking level. All parameters used to describe the
chamber geometry are stored in the database.

Tracking in the drift chamber is performed by a dedicated package that uses
standard GEANT routines for particle propagation and for interactions in the
medium. The cell geometry is calculated for each tracking step using the wire
positions and stereo angles stored in the database; the wire sags are also taken
into account. When a particle hits a wire, a multiple-scattering simulation
using the appropriate wire material is performed. The energy loss in each cell
is also computed.

For each cell crossed, the program computes the distance of closest approach 
between the track helix and the nearest sense wire. These distances are con- 
verted to drift times using s-t relations that are parameterized as described 
in Sec. 4.2. The constants describing the s-t relations used for this conversion 
are obtained from a detailed simulation of the electron drift performed with 
the GARFIELD program [38]. 

At the digitization stage, the TDC-signal arrival time is calculated, with the 
drift time, the particle time of flight, and the propagation time of the signal 
along the wire taken into account. For cells crossed by more than one particle 
(or more than once by the same particle), only the signal coming from the 
first hit is registered. The raw signal arrival times are then written to output 
banks that serve as the input to the reconstruction program. An algorithm for 
digitization of the charge values for each wire to simulate the measurements 
from the recently installed ADCs is currently under development. 
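
The timing part of the digitization step described above can be sketched as follows. This is a minimal Python illustration under the assumption that the "first hit" on a wire is the one with the earliest signal arrival time; the class and function names are hypothetical and do not correspond to GEANFI routines.

```python
from dataclasses import dataclass

@dataclass
class CellCrossing:
    wire_id: int
    tof_ns: float          # particle time of flight to the cell
    drift_ns: float        # drift time from the s-t relation
    propagation_ns: float  # signal propagation time along the wire

def digitize(crossings):
    """Return {wire_id: TDC arrival time}, keeping only the earliest signal per wire."""
    tdc = {}
    for c in crossings:
        arrival = c.tof_ns + c.drift_ns + c.propagation_ns
        if c.wire_id not in tdc or arrival < tdc[c.wire_id]:
            tdc[c.wire_id] = arrival
    return tdc

# Example: wire 7 is crossed twice; only the earlier signal survives.
hits = digitize([
    CellCrossing(7, 3.0, 100.0, 2.0),
    CellCrossing(7, 5.0, 40.0, 1.0),
    CellCrossing(9, 4.0, 10.0, 0.5),
])
```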

The drift-chamber reconstruction of simulated data is essentially identical to 
that of real data, with two notable exceptions. First, a dedicated reconstruc- 
tion module allows hits on dead channels to be deleted (the configuration of 
dead channels during data taking is stored in the database run by run). Sec- 
ond, the s-t relations used for the track reconstruction are obtained by the 
calibration procedure described in Sec. 4.2, using simulated cosmic-ray events. 
5.4 Calorimeter simulation



In order to reduce CPU consumption, the GEANT representation of the calorime- 
ter geometry does not include a detailed description of the individual fibers 
embedded in the grooved lead plates. An approximate geometry consisting of 
thin, alternating layers of lead and scintillator is used instead. 

The starting point for the simulation of the EmC response is the energy
deposition of the incident particle in the active material, $\Delta E$. The light yield
collected at each end of a calorimeter module is calculated by correcting $\Delta E$
as a function of the point of impact along the fibers to account for light
attenuation. The resulting energy is converted into a number of photoelectrons,
$N_{pe}$, using an average value for the light-yield conversion constant, $Y_{MC}$, and
applying Poisson statistics to simulate the fluctuations.

To each photoelectron, a time is assigned by adding scintillation and
light-propagation times to the arrival time of the particle. The number of
photoelectrons and the photoelectron times are accumulated for each detector cell,
i.e., for the entire volume viewed by each individual PMT. The energy
measured for each PMT is obtained by dividing the total number of photoelectrons
by $Y_{MC}$. The final PMT time measurement is obtained from the time
distribution of the photoelectrons collected. In order to simulate the behavior of
the constant-fraction discriminators used in the experiment, this time is set
to the value corresponding to the integration of 15% of the complete signal.
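
The last two steps, converting the photoelectron count back to an energy and extracting a constant-fraction time, can be illustrated with the short Python sketch below. It assumes that "integration of 15% of the complete signal" can be approximated by the arrival time of the photoelectron at which 15% of the collected photoelectrons have been counted; the Poisson fluctuation and light-attenuation steps are left out, and all names are hypothetical.

```python
import math

def pmt_energy_mev(n_pe, y_mc):
    """Energy measured by one PMT: total photoelectrons divided by Y_MC (p.e./MeV)."""
    return n_pe / y_mc

def constant_fraction_time(pe_times_ns, fraction=0.15):
    """Time at which the given fraction of the collected photoelectrons has arrived."""
    if not pe_times_ns:
        return None
    ordered = sorted(pe_times_ns)
    index = max(math.ceil(fraction * len(ordered)) - 1, 0)
    return ordered[index]

# Example: 20 photoelectrons arriving at 0, 1, ..., 19 ns.
# 15% of 20 is 3, so the time of the third photoelectron (2 ns) is returned.
pe_times = [float(i) for i in range(20)]
t_pmt = constant_fraction_time(pe_times)
e_pmt = pmt_energy_mev(1900, 19.0)
```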

We have made extensive use of $\phi \to \pi^+\pi^-\pi^0$ events in tuning the simulation
of the calorimeter. In such events, the energy and momentum of one of the
photons can be accurately predicted from the reconstruction of the $\pi^+\pi^-$
vertex and the position alone of the cluster from the other photon. No other
calorimetric information is needed.

To establish the thickness of the lead and scintillator planes in the simulated
geometry, we have minimized the differences between the shower shapes for
photons in data and MC events. Using $\phi \to \pi^+\pi^-\pi^0$ events in the data set, the
distribution of the depth of the first plane fired by incident photons of given
energy $E_\gamma$ and polar angle $\theta_\gamma$ has been fit with a discretized exponential
function with mean-depth parameter $\lambda$. In Fig. 15a, the dependence of $\lambda$ on $E_\gamma$ is
shown for different values of $\theta_\gamma$. The distributions flatten above 200 MeV, as
expected when the cross section for $e^+e^-$-pair creation approaches the plateau
limit corresponding to an interaction length of 7/9 $X_0$. The plateau values of
the interaction length for different $\theta_\gamma$ intervals shown in Fig. 15b correspond
to values for $X_0$ of ~1.2 cm. This is in reasonable agreement with the
radiation length estimated a priori from the known composition of the calorimeter
modules [2]. Using the same technique, we have also measured the effective
radiation length in the Monte Carlo and varied the relative thickness of the
lead and scintillator planes in order to establish agreement with data. This
procedure leads to a representation of the calorimeter module as 220 layers of
480 μm of lead plus 620 μm of scintillator.

Fig. 15. a) Mean interaction-length parameter $\lambda$ as a function of photon energy
for different intervals in the photon polar angle $\theta_\gamma$. b) Limiting (plateau) values of
$\lambda$ for different intervals in $\theta_\gamma$.

To calibrate the calorimeter response, we have used $\phi \to \pi^+\pi^-\pi^0$ events with
particles crossing the center of the calorimeter modules ($s = s_0$) to determine
the average light-yield conversion constant for data, $Y$, as a function of the
energy of the incident particle. The relation between $Y$ and $Y_{MC}$ is $Y_{MC} =
Y g_i(s_0)/f_e$ (recall that $g_i$ is the correction factor for light attenuation in the
fibers of the $i$th cell; $f_e$ is the sampling fraction for electromagnetic showers). If
Poisson statistics dominate the fluctuations in the energy response, we expect
the distributions of the ratios $E_A/E_B$ and $(E_A - E_B)/(E_A + E_B)$, where the
values $E_{A,B}$ refer to the energy measurement at each side of the module,
to have variances $\sigma^2 = 2/N_{pe}$. We obtain $Y$ = 0.6-0.7 p.e./MeV per side.
This has led us to set $Y_{MC}$ = 19 p.e./MeV in the most recent version of the
MC. After these adjustments, reasonable agreement between MC and data
is observed for the energy response and resolution as a function of $E_\gamma$ (see
Fig. 16).

Fig. 16. Energy response and resolution determined using $\phi \to \pi^+\pi^-\pi^0$ events with
photons incident on the calorimeter barrel: a) relative linearity of energy response,
$\Delta E_\gamma/E_\gamma$, as a function of $E_\gamma$; b) relative energy resolution, $\sigma(E_\gamma)/E_\gamma$, as a function
of $E_\gamma$. Solid (open) circles are for data (MC).

With the geometry and response of the calorimeter thus simulated, assuming
that the visible energy follows the spectrum of energy loss inside the
scintillator, we obtain sampling fractions $f_e$ = 11% for electromagnetic showers, and
$f_\mu$ = 18% for minimum-ionizing particles. The ratio $f_e/f_\mu$ = 0.6 is 20% lower
than the value measured using a test beam. The same discrepancy between MC
and data has been found for the position of the minimum-ionizing peak from
the most energetic pions in $\phi \to \pi^+\pi^-\pi^0$ events. Samples of $e^+e^- \to \mu^+\mu^-(\gamma)$
and $\phi \to \pi^+\pi^-\pi^0$ events in data are currently being used to adjust the average
energy loss of pions and muons in the scintillator in order to obtain good
MC-data agreement on the calorimeter energy response over the entire momentum
range of interest.

The effect of the cracks between the barrel modules is illustrated in Fig. 17,
which shows the ratio $(E_{cl} - E_\gamma)/E_\gamma$ as a function of azimuthal distance from
the module boundaries for photons from $\phi \to \pi^+\pi^-\pi^0$ events. A clear
deterioration in the response is observed within ±1° of the module boundaries. This
effect is due to fibers broken during the final milling of the modules, and it is
not easy to include in the MC given the representation of the geometry in use.
We simulate this effect during event reconstruction by weighting the
reconstructed energies with a function of the azimuthal positions of the generated
hits. A similar effect is observed in the endcaps; in this case, the magnitude
of the effect is smaller, and it is not yet corrected for.

Fig. 17. Energy response as a function of the azimuthal distance from the boundaries
between modules of the barrel calorimeter. Solid (open) circles are for data (MC).

For the time simulation, the scintillation curve for single photoelectrons has
been tuned to reproduce the stochastic contribution to the timing resolution of
$54\ \mathrm{ps}/\sqrt{E\,(\mathrm{GeV})}$. The MC-data agreement after the adjustment is reasonable.
The constant contribution to the timing resolution observed in data, ~140 ps,
is mostly due to jitter introduced when rephasing the trigger with the machine
RF signal. To simulate this effect, an offset sampled from a Gaussian with a
width of 140 ps is added in common to all time signals in the event.

5.5 Trigger simulation

The KLOE trigger is emulated in software during event reconstruction.
Nontriggering events are retained in the output, but the result of the trigger
emulation is encoded in the data stream, allowing MC estimates of the trigger
efficiency to be obtained.



For the emulation of the EmC trigger, the energy deposited in each calorimeter
element and the PMT-signal arrival times are first read out. For each trigger
sector, the energies of all cells fired within a coincidence window of 3.5 ns
are summed, where this interval approximately corresponds to the width of
the actual PMT signals. By comparison to a set of discriminators reproducing
the hardware circuitry, these sums are transformed into logic signals of 70-ns
duration. Three different sets of thresholds are used to distinguish $\phi$ decays,
Bhabha events, and cosmic-ray events. The threshold values are determined
from the analysis of real data on a run-by-run basis. The resulting logic signals
are used to compute the multiplicity of hit sectors on the barrel and each of the
two endcaps, and finally combined to produce the $\phi$, Bhabha, and cosmic-ray
trigger signals.

The signals from the DC wires are read out and shaped at 250 ns. As in the
hardware, the signals from wires in different groups of adjacent DC planes
are summed. These "superlayer" signals are then summed in turn to get the
effective DC multiplicity as a function of time. A level-1 DC trigger is set
whenever this sum exceeds a given threshold. The sum is then integrated over
a 1.2 μs interval and compared to another threshold to define the level-2 DC
signal. The values of these two thresholds are determined from the analysis of
real data on a run-by-run basis.

Finally, the DC- and EmC-trigger signals are combined to deliver the final
level-1 and level-2 trigger decisions with the correct timing relative to the
start of the event as generated. Once the trigger time has been simulated,
it is rounded to the next highest multiple of $4t_{RF}$ to simulate the rephasing
of the experiment's level-1 trigger with the machine clock. A time interval
corresponding to an integer number of bunch crossings from one to four is
then subtracted from the rephased trigger time; this corresponds to randomly
specifying the particular bunch crossing that produced the event. The result
is the simulated value of $t_{0,evt}$; this value is then applied to the times of all
calorimeter and drift-chamber hits.
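
The rephasing arithmetic in the last step can be sketched in Python as follows. The bunch-crossing period `T_RF_NS` is an assumed value not quoted in this section, the function names are hypothetical, and "next highest multiple" is read here as rounding up (a trigger time that is already an exact multiple is left unchanged).

```python
import math
import random

T_RF_NS = 2.715  # assumed bunch-crossing period t_RF in ns (not from this section)

def rephase_trigger(t_trigger_ns, t_rf_ns=T_RF_NS):
    """Round the simulated trigger time up to a multiple of 4 t_RF."""
    period = 4.0 * t_rf_ns
    return math.ceil(t_trigger_ns / period) * period

def simulate_t0_evt(t_trigger_ns, rng, t_rf_ns=T_RF_NS):
    """Rephased trigger time minus a random integer (1 to 4) number of bunch crossings."""
    n_bc = rng.randint(1, 4)
    return rephase_trigger(t_trigger_ns, t_rf_ns) - n_bc * t_rf_ns

rng = random.Random(1)
t0_evt = simulate_t0_evt(5.0, rng)
```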



5.6 Machine background simulation

A detailed simulation of detector activity from the accidental coincidence of 
hits from machine background is required in order to obtain the high precision 
and careful control of systematics needed for most KLOE physics analyses. 
This activity consists mainly of noise hits in the DC and low-energy clusters 
in the EmC, mostly at small angles. Background hits in the chamber and 
calorimeter are added to the simulated events at the reconstruction stage. 
For the 2001-2002 data, this background was obtained from $e^+e^- \to \gamma\gamma$ events
satisfying specific topological cuts. These events are selected from KLOE data
with a cross section of ~40 nb. Since $e^+e^- \to \gamma\gamma$ events are fully neutral, all
DC hits in these events are considered background, in addition to all EmC
clusters not identified as belonging to the $\gamma\gamma$ topology (care is taken to
correctly distinguish clusters from initial-state radiation or from cluster splitting,
which actually belong to the $\gamma\gamma$ topology, from those due to machine
background).

A file containing background hits is created for each raw file in the data 
set. As discussed in Sec. 5.7, an MC run corresponds to a set of raw files in 
data. We insert the hits from each event in the set of background files into 
multiple events in the corresponding MC run, with a reuse factor chosen to 
ensure that all background events are used roughly the same number of times. 
This ensures reproduction of the time-variable background spectrum in the 
simulated output. 

For both the EmC and DC, when hits are inserted, their timing relative to
the start time of the $\gamma\gamma$ event from which they were extracted is preserved.
The insertion takes place before the trigger simulation is performed, so that
simulated and inserted hits are temporally aligned. Hit-blocking effects are
reproduced. In the drift chamber, a background hit that arrives earlier than a
simulated hit on the same wire causes the simulated hit to be removed from
the event, and vice versa. On the calorimeter, if both a background hit and a
simulated hit occupy the same cell, the earlier arrival time on each side of the
cell is retained, while the energy read out at each side is taken from the sum
of the two hits. The trigger simulation is then performed, and the simulated
and inserted hits are then $t_0$-smeared simultaneously using the algorithm of
Sec. 5.5.
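
The hit-blocking rules for the two detectors can be written compactly. The Python sketch below uses plain dictionaries and tuples as stand-ins for the actual hit banks; the data layout and function names are assumptions made for illustration.

```python
def merge_dc_hits(simulated, background):
    """Merge DC hits given as {wire_id: arrival_time_ns}; the earlier hit on a wire wins."""
    merged = dict(simulated)
    for wire_id, t_bkg in background.items():
        if wire_id not in merged or t_bkg < merged[wire_id]:
            merged[wire_id] = t_bkg
    return merged

def merge_emc_cell(sim_hit, bkg_hit):
    """Merge two hits in the same calorimeter cell.

    Each hit is ((t_A, E_A), (t_B, E_B)) for the two sides of the cell.
    On each side the earlier time is kept and the energies are summed.
    """
    return tuple(
        (min(t_sim, t_bkg), e_sim + e_bkg)
        for (t_sim, e_sim), (t_bkg, e_bkg) in zip(sim_hit, bkg_hit)
    )

dc = merge_dc_hits({1: 50.0, 2: 30.0}, {1: 40.0, 3: 10.0})
cell = merge_emc_cell(((5.0, 20.0), (6.0, 18.0)), ((4.0, 3.0), (7.0, 2.5)))
```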

For the drift chamber, the s-t relations used for simulated events and for 
real data are sufficiently similar so that all hits — simulated and inserted — 
can be reconstructed with the MC s-t relations. A correction is made to the 
energy scale when calorimeter hits are inserted. This correction ensures that 
the inserted calorimeter hits reconstruct with the same energy that they had 
in the data event from which they were extracted. 



5.7 Monte Carlo production campaigns

An extensive simulation campaign for the 2001-2002 KLOE data set is
currently near completion. This campaign is focused on the production of
general-purpose samples, such as samples in which all decays of the $\phi$ are present in
proportion to their natural branching ratios, or in which the $\phi$ always decays
to $K_S K_L$ but all possible final states are present. Such samples are particularly
useful for understanding backgrounds in studies of rare decays. The production
procedure is geared towards providing high-statistics samples. The total
number of events in each sample is established using an effective luminosity
scale factor, which ranges from 0.2 for general-purpose simulations such as
$\phi \to$ all (peak cross section ~3.1 μb), to 5 for dedicated simulations such
as $e^+e^- \to \pi^+\pi^-\gamma$ (cross section ~50 nb). In all, current plans call for the
production of about $10^9$ events.

In order to track run-by-run variations in the operating conditions of the 
collider and detector, an MC sample is generated for each run in the data set, 
with the number of events proportional to the integrated luminosity of the run 
under simulation, and such parameters as machine energy, momentum of the 
collision center-of-mass, beam-spot position, map of dead detector elements, 
and trigger thresholds set to correspond to the run conditions. Background 
hits in the EmC and DC are inserted with special care in order to ensure 
reproduction of the background spectra resulting from variations with time 
within each run (see Sec. 5.6). As a result of these procedures, time-variable 
conditions are correctly averaged in the sample of MC events corresponding 
to any given group of runs in the data set. 
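
The run-by-run allocation can be illustrated with a short Python sketch. The 
run numbers and integrated luminosities below are invented for illustration, 
and the function is hypothetical; the point is only that each run's share of 
the MC sample equals its share of the integrated luminosity. 

```python
# Hypothetical sketch: run numbers and luminosities below are invented
# for illustration and do not correspond to real KLOE runs.

def events_per_run(run_lumi_inv_nb, cross_section_nb, scale_factor):
    """Map run number -> number of MC events to generate.

    Each run receives scale_factor * sigma * L_run events, so the MC
    sample for any group of runs is weighted by luminosity like the data.
    """
    return {
        run: round(scale_factor * cross_section_nb * lumi)
        for run, lumi in run_lumi_inv_nb.items()
    }


# Invented integrated luminosities in nb^-1 for three runs.
runs = {20001: 150.0, 20002: 300.0, 20003: 75.0}

# phi -> all at scale factor 0.2, with sigma ~ 3.1 ub = 3100 nb.
plan = events_per_run(runs, cross_section_nb=3100.0, scale_factor=0.2)

total_events = sum(plan.values())
total_lumi = sum(runs.values())
for run, n in plan.items():
    print(run, n, f"MC share {n / total_events:.3f}",
          f"lumi share {runs[run] / total_lumi:.3f}")
```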

For the production of a given sample, one job is submitted for each run in 
the data set. A production job handles generation, reconstruction, and DST 
creation. In order to have intermediate files of reasonable size, it is usually 
necessary to split the generation into several processes. A reconstruction pro- 
cess immediately follows each generation process. DSTs are made after all 
generation and reconstruction is complete. Production is started by submit- 
ting a large number of jobs to a batch queue managed by IBM's LoadLeveler 
utility [16]. 
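
The structure of a single production job can be sketched as follows. This is 
a hypothetical Python illustration: the step names, the run number, and the 
maximum number of events per process are assumptions made for the example 
and are not taken from the KLOE production scripts or from LoadLeveler. 

```python
# Hypothetical sketch of the per-run job structure; the chunk size and
# step names are illustrative, not taken from the KLOE production scripts.

def plan_job(run, n_events, max_events_per_process):
    """List the steps of one production job for a single run."""
    steps = []
    first = 0
    while first < n_events:
        n = min(max_events_per_process, n_events - first)
        # Each generation process is immediately followed by the
        # reconstruction of the file it produced.
        steps.append(("generate", run, first, n))
        steps.append(("reconstruct", run, first, n))
        first += n
    # DSTs are made only once all generation and reconstruction is done.
    steps.append(("make_dsts", run, 0, n_events))
    return steps


for step in plan_job(run=20001, n_events=93000, max_events_per_process=40000):
    print(step)
```

Splitting the generation in this way bounds the size of each intermediate 
file by the chosen maximum number of events per process. 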

When MC events are reconstructed, several algorithms intended to complete 
the simulation are run before any of the actual reconstruction algorithms. 
Background hits in the EmC and DC are first inserted. Hits on dead wires of 
the drift chamber are next removed. The trigger emulator is then run, after 
which hits on hot drift-chamber wires can be removed (in the reconstruction 
of real data, they are removed at the input stage). Finally, the t-smearing 
algorithm is applied. After these steps, the same algorithms used for the 
reconstruction of real data are run, in the same order described in Sec. 3.1. 
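
The ordering of these steps can be summarized in a short Python sketch. The 
stage names paraphrase the text and are hypothetical; the actual module names 
and interfaces are not given here, and the real-data sequence shown is 
deliberately simplified to the one difference mentioned above. 

```python
# Hypothetical sketch: stage names paraphrase the text; the real module
# names and interfaces are not specified in this paper.

MC_COMPLETION_STAGES = [
    "insert_background_hits",   # EmC and DC background hits
    "remove_dead_wire_hits",    # hits on dead drift-chamber wires
    "run_trigger_emulator",
    "remove_hot_wire_hits",     # only after the trigger emulator has run
    "apply_smearing",           # the smearing algorithm named in the text
]


def processing_sequence(is_mc):
    """Return the ordered list of stages for MC or (simplified) real data."""
    if is_mc:
        return MC_COMPLETION_STAGES + ["standard_reconstruction"]
    # For real data, hot-wire hits are removed at the input stage.
    return ["remove_hot_wire_hits", "standard_reconstruction"]


for name in processing_sequence(is_mc=True):
    print(name)
```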

The only other special treatment given to MC events concerns the behavior 
of the machine-background filter and the event-classification module. Like the 
trigger emulator, these modules only record their decisions in the output file; 
they do not actually suppress events. In particular, MC events are not divided 
up into streams at the reconstruction stage; only one reconstruction output 
file is produced from each generator output file. 

Program            Events (10⁶)    CPU time (days)    Output size (TB) 

φ → all                 255              1100               6.9 
φ → K_S K_L             410              1800              11.0 
e⁺e⁻ → ππγ               36               110               0.8 

Table 4 

Statistics for some Monte Carlo production campaigns completed to date 



The reconstruction output file contains enough information to allow recovery 
of the events as generated, before the introduction of background hits. There- 
fore, only the reconstructed output files are archived; generator output files 
are discarded. 



In the last stage of the production job, DSTs are produced. The same five types 
of DSTs as for real data can be produced for MC events, with the application 
of the same stream-specific algorithms described in Sec. 3.2. However, for the 
production of dedicated MC samples (e.g., for the process e⁺e⁻ → π⁺π⁻γ), 
only the DST types of interest are produced. Event streaming is performed 
at the DST-production stage. In addition to all events classified on the ba- 
sis of reconstructed quantities, each MC DST stream contains all events with 
topologies as-generated relevant to the physics of the stream. MC DSTs also 
contain a minimal set of information about the true event topology. All in- 
formation in the GEANT KINE and VERT banks is present in the DSTs, but 
there is no information about individual hits. In place of the hit banks them- 
selves, the correspondences between reconstructed topologies (clusters, tracks) 
and simulated particles (KINE tracks) are kept. Like data DSTs, MC DSTs 
are archived and recalled to the NFS-mounted disk cache for prompt access. 
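
The selection rule for MC DST streams can be sketched in Python as a logical 
OR of two conditions. The stream name, topology labels, and event fields 
below are invented for illustration and do not reflect the actual KLOE bank 
structure or stream definitions. 

```python
# Hypothetical sketch: the stream name, topology labels, and event
# fields are invented for illustration.

STREAM_TOPOLOGIES = {
    "neutral_kaon": {"phi->KSKL"},
}


def in_mc_dst_stream(event, stream):
    """True if the event is classified into the stream from reconstructed
    quantities, or if its generated topology is relevant to the stream."""
    selected_by_reco = stream in event["reco_streams"]
    relevant_as_generated = event["true_topology"] in STREAM_TOPOLOGIES[stream]
    return selected_by_reco or relevant_as_generated


events = [
    {"id": 1, "reco_streams": {"neutral_kaon"}, "true_topology": "phi->KSKL"},
    {"id": 2, "reco_streams": set(),            "true_topology": "phi->KSKL"},
    {"id": 3, "reco_streams": {"neutral_kaon"}, "true_topology": "phi->K+K-"},
    {"id": 4, "reco_streams": set(),            "true_topology": "phi->K+K-"},
]

kept = [e["id"] for e in events if in_mc_dst_stream(e, "neutral_kaon")]
print(kept)
```

In this toy example, event 2 is kept although it was not selected from 
reconstructed quantities, and event 3 is kept although its generated 
topology is not that of the stream. 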



Table 4 gives a statistical summary of the Monte Carlo production campaigns 
completed to date. In the two general-purpose production campaigns, φ → all 
and φ → K_S K_L, the entire 2001-2002 data set (~450 pb⁻¹) was simulated 
at luminosity scale factors of 0.2 and 1, respectively. Events such as these 
require 200 ms to generate and 175 ms to reconstruct on the CPUs in the B80 
servers; the running times in the table were obtained with 60 CPUs. For the 
e⁺e⁻ → ππγ campaign, only the 2001 data were simulated (~170 pb⁻¹), at a 
luminosity scale factor of 5. In these three campaigns, a total of 7 × 10⁸ 
events were produced in about three months of real time. 
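
As a rough consistency check, the per-event times quoted above can be 
converted into total processing time. The Python sketch below uses the event 
counts of the two general-purpose campaigns from Table 4 (255 × 10⁶ and 
410 × 10⁶) and assumes that the times in the table are summed over all CPUs. 
The even division over 60 CPUs is an idealization added for illustration. 

```python
# Rough consistency check; assumes the table times are summed over CPUs.

SECONDS_PER_DAY = 86400.0


def cpu_days(n_events, gen_s=0.200, rec_s=0.175):
    """Total CPU time in days for generation plus reconstruction."""
    return n_events * (gen_s + rec_s) / SECONDS_PER_DAY


for n_events in (255e6, 410e6):
    total = cpu_days(n_events)
    # Idealized wall-clock time if the work is spread evenly over 60 CPUs.
    print(f"{n_events:.3g} events: {total:.0f} CPU-days, "
          f"about {total / 60:.0f} days on 60 CPUs")
```

Under this assumption, the results agree to within a few percent with the 
times of 1100 and 1800 days listed in Table 4. 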
6 Conclusions 



The high event rate at DAΦNE (1.5 kHz of φ decays, accompanied by a similar 
yield of Bhabha events within the acceptance, which must be downscaled, and 
of machine-background and cosmic-ray events, which must be rejected) has 
required us to design and operate a large, complex, and reliable system for 
data acquisition and offline data processing. 



The DAQ system described in Ref. 5 has guaranteed a bandwidth of 3 kHz 
during data taking, while simultaneously handling various tasks related to 
data-quality control and subdetector calibration and monitoring. At present, 
the mass-storage and data-handling systems manage over 100 TB of raw data 
and a comparable amount of reconstructed data both from the detector and 
from the experiment's Monte Carlo. 



We have carefully designed and optimized the offline software environment to 
ensure that data is reconstructed immediately following acquisition. As part 
of this effort, we have developed various tools for detector calibration, access 
to reconstructed data, process scheduling, and the like. We have placed special 
emphasis on maximizing the efficiency and precision of the reconstruction pro- 
gram. As a result, the performance specifications of the detector — momentum 
and vertex resolution for the drift chamber and energy and time resolution for 
the calorimeter — have been fully satisfied. 



At the same time, we have implemented a continuing series of improvements 
to the simulation of the detector response, the representation of machine back- 
ground, and the accuracy of the physics generators in the experiment's Monte 
Carlo. As a result of this development program, excellent agreement between 
data and Monte Carlo has been obtained for the distributions of key variables, 
and the Monte Carlo has become a reliable and important tool for physics 
analysis. 



We have dedicated a significant amount of work to the construction of a stable, 
scalable data-processing system with the flexibility to exploit all of the avail- 
able resources. During the past four years of operation, we have implemented 
a series of important upgrades to keep pace with the growing demands of the 
experiment. With the upgrades already scheduled for 2004, the environment 
will be well suited to handle predicted increases in the DAΦNE luminosity. 